Support enzyme in KA #2260
@@ -243,4 +243,10 @@ function KA.priority!(::CUDABackend, prio::Symbol)
     return nothing
 end

+KA.supports_enzyme(::CUDABackend) = true
+function KA.__fake_compiler_job(::CUDABackend)
+    mi = CUDA.methodinstance(typeof(()->return), Tuple{})
+    return CUDA.CompilerJob(mi, CUDA.compiler_config(CUDA.device()))
+end
Comment on lines +247 to +250
This is very sketchy...

So the core challenge we have is that we need to allocate memory of a certain type for the tape. This allocation needs to occur on the outside, and the memory is then passed into the kernel. What we came up with was a reflection function. The crux is that we can't use the host job for the reflection, since in the end this is a deferred compilation, which requires us to take the CUDA method table into account. So for this reflection we need the parent job. It doesn't have to be the real one, as long as the method table matches; besides, the parent kernel will take the allocated array as an argument, so we can't even construct the real job yet.

Part of the challenge here is that only the backend packages know how to construct an appropriate job. We also need to do something similar to support reverse mode for CUDA.jl directly.
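For readers following the thread, the "reflect on the tape type, allocate outside, pass in" idea can be sketched on the host in plain Julia. This is only an illustration under stated assumptions: `augmented_square`, `tape_eltype`, and `forward_kernel!` are hypothetical names, not Enzyme, KernelAbstractions, or CUDA.jl API, and the loop stands in for a real kernel launch.

```julia
# Hypothetical augmented forward pass: returns the primal result together
# with the values a reverse pass would need later (the "tape entry").
augmented_square(x::Float64) = (x^2, (x, 2x))

# Reflection step: ask the compiler for the inferred return type without
# running the function, then pick out the type of the tape entry.
function tape_eltype(f, argtypes::Type{<:Tuple})
    RT = only(Base.return_types(f, argtypes))  # inferred return type
    return fieldtype(RT, 2)                    # second slot is the tape entry
end

TapeT = tape_eltype(augmented_square, Tuple{Float64})

# The tape is allocated on the outside, before the "kernel" runs.
xs = [1.0, 2.0, 3.0]
ys = similar(xs)
tape = Vector{TapeT}(undef, length(xs))

# Stand-in for the kernel: it receives the preallocated tape as an argument.
function forward_kernel!(ys, tape, xs)
    for i in eachindex(xs)
        ys[i], tape[i] = augmented_square(xs[i])
    end
    return nothing
end

forward_kernel!(ys, tape, xs)
```

`Base.return_types` infers against the host method table by default, which is the limitation described above: for device code the same question has to be asked through a job that carries the GPU method table.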
end
It seems weird/wrong to add an Enzyme-specific API to the KA.jl interface while the Enzyme support lives entirely in an extension package.
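For context, the interface addition under discussion amounts to a trait-style opt-in plus a backend-provided hook. A minimal, self-contained sketch of that pattern follows. Everything in it is a mock: `Backend`, `MockGPUBackend`, `MockJob`, `fake_compiler_job`, and `reflection_job` are hypothetical stand-ins, and the `false` default is an assumption rather than something the diff shows.

```julia
# Mock of the generic interface package.
abstract type Backend end

# Assumed conservative default: backends must opt in explicitly.
supports_enzyme(::Backend) = false

# Function stub with no methods; only backend packages know how to build a job.
function fake_compiler_job end

# Mock of a backend package that opts in.
struct MockGPUBackend <: Backend end
struct MockJob
    method_table::Symbol   # stands in for the device method table
end
supports_enzyme(::MockGPUBackend) = true
fake_compiler_job(::MockGPUBackend) = MockJob(:gpu)

# A backend that never opts in.
struct PlainBackend <: Backend end

# Generic (extension-side) code checks the trait before asking for a job.
function reflection_job(backend::Backend)
    supports_enzyme(backend) || return nothing
    return fake_compiler_job(backend)
end

gpu_job = reflection_job(MockGPUBackend())
plain_job = reflection_job(PlainBackend())
```

The objection above is about where the trait and the stub are declared, not about the pattern itself: here they sit in the generic interface even though only the Enzyme-specific code calls them.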