Prepare for more CUDA/HIP unification #1616
edc816d to c96101e
In general, looks good to me. I've left some remarks, which are mostly quite minor.
The only thing I would like to see changed is to revert the header ordering of the hip runtime header. I think it is already causing compilation issues.
Another question is about as_cuda_type and as_hip_type. Should they also be unified?
36f3342 to 888aa89
@MarcelKoch as_cuda_type and as_hip_type get replaced by the automatic script, so I didn't include those changes here.
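For readers unfamiliar with the helpers in question, here is a minimal host-only sketch of what a single, backend-neutral cast helper could look like. Everything in the `sketch` namespace is hypothetical, including `device_complex` (a stand-in for whatever complex type the backend provides) and `as_device_type`; it is not the project's actual implementation and not what the automatic script produces.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for a backend complex type; a real backend would
// supply its own type with a layout compatible with std::complex<T>.
template <typename T>
struct device_complex {
    T re;
    T im;
};

// Map host types to their device counterparts; non-complex types map to
// themselves.
template <typename T>
struct device_type_map {
    using type = T;
};

template <typename T>
struct device_type_map<std::complex<T>> {
    using type = device_complex<T>;
};

// Propagate the mapping through const qualifiers and pointers.
template <typename T>
struct device_type_map<const T> {
    using type = const typename device_type_map<T>::type;
};

template <typename T>
struct device_type_map<T*> {
    using type = typename device_type_map<T>::type*;
};

template <typename T>
using device_type = typename device_type_map<T>::type;

// One helper for both backends: reinterpret a host pointer as a pointer to
// the mapped device type. The address is unchanged.
template <typename T>
device_type<T*> as_device_type(T* ptr)
{
    return reinterpret_cast<device_type<T*>>(ptr);
}

}  // namespace sketch
```

With one such helper, kernel code shared between the two backends would not need to name the backend in the cast.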
- When changing <hip/hip_runtime.h> to "common/cuda_hip/base/runtime.hpp", the header order should also be considered.
- Because the files are not merged together in this PR, some HIP checks in CUDA files and some CUDA checks in HIP files should be moved to another PR.
- EXEC_NAMESPACE -> GKO_DEVICE_NAMESPACE is not necessary.
Also, if the code paths are quite different between CUDA and HIP, I would prefer distinguishing them by file rather than by macro, for readability.
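The switching header mentioned in the first point can be pictured with the following sketch. It assumes the build passes a per-backend compile definition (GKO_COMPILING_HIP and GKO_COMPILING_CUDA are used here as assumed names), and `SKETCH_BACKEND_NAME` and `sketch_backend_name` are hypothetical additions so the snippet can also be compiled on a host without either toolkit.

```cpp
// Sketch of a switching header in the spirit of
// "common/cuda_hip/base/runtime.hpp"; not the project's actual file.
#include <cassert>
#include <cstring>

#if defined(GKO_COMPILING_HIP)
// HIP build: pull in the HIP runtime.
#include <hip/hip_runtime.h>
#define SKETCH_BACKEND_NAME "hip"
#elif defined(GKO_COMPILING_CUDA)
// CUDA build: pull in the CUDA runtime.
#include <cuda_runtime.h>
#define SKETCH_BACKEND_NAME "cuda"
#else
// Neither flag set (e.g. a host-only build): include nothing.
#define SKETCH_BACKEND_NAME "none"
#endif

// Hypothetical helper reporting which branch was selected at compile time.
inline const char* sketch_backend_name() { return SKETCH_BACKEND_NAME; }
```

Because the replacement is a quoted project header rather than an angle-bracket system header, an include sorter may place it in a different group than the line it replaces, which is presumably why the header order needs a second look.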
For the code itself, only some macros can still use the unified one.
because
But I have some concerns:
Will it make the headers more fragmented? One needs to jump through many levels to figure out what something is. Also, the two implementations are distinguished by a compile flag, so the editor may not find the correct one or may always show both. (My editor shows both implementations of pointer_gaurd but cannot jump for blas::gemm.)
Another concern is that it may break IWYU (include what you use), although we do not strictly follow this rule across all of the code.
- Add necessary switching headers
- Provide device namespace macro via compiler definitions
- Add necessary (namespace) aliases
- Adapt math lib includes and namespaces
- Uniformize files

- Fix HIP compilation issues
- Uniform ifdef checks
- deviceComplex type aliases
- Remove unnecessary includes
Co-authored-by: Marcel Koch <[email protected]>

- Make sparselib/blas the only non-deprecated way of getting handles
- Fix header orders
Co-authored-by: Yuhsiang M. Tsai <[email protected]>
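As a rough illustration of "device namespace macro via compiler definitions": the build system would pass something like -DGKO_DEVICE_NAMESPACE=cuda or -DGKO_DEVICE_NAMESPACE=hip (the exact flag spelling is an assumption), and shared source files open the namespace through the macro. The `sketch` namespace, the `host_fallback` default, and both functions below are hypothetical and exist only so the snippet compiles without a build system.

```cpp
#include <cassert>
#include <string>

// Normally supplied on the compiler command line; the fallback here is a
// hypothetical default so the sketch compiles standalone.
#ifndef GKO_DEVICE_NAMESPACE
#define GKO_DEVICE_NAMESPACE host_fallback
#endif

// Two-step stringification so the macro is expanded before being quoted.
#define SKETCH_STRINGIFY_IMPL(x) #x
#define SKETCH_STRINGIFY(x) SKETCH_STRINGIFY_IMPL(x)

namespace sketch {
namespace kernels {
namespace GKO_DEVICE_NAMESPACE {

// The same source text lands in kernels::cuda or kernels::hip (or the
// fallback), depending on the definition passed by the build.
inline int scale(int x) { return 2 * x; }

// Report the namespace name chosen at compile time.
inline std::string backend_name()
{
    return SKETCH_STRINGIFY(GKO_DEVICE_NAMESPACE);
}

}  // namespace GKO_DEVICE_NAMESPACE
}  // namespace kernels
}  // namespace sketch
```

Callers in shared files can then write `sketch::kernels::GKO_DEVICE_NAMESPACE::scale(21)` without knowing which backend is being compiled.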
d239da7 to 45505ea
45505ea to 7f7ce32
Quality Gate failed
To simplify the review of #1516, this PR makes all the changes needed to automate the majority of the other changes.