Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare for more CUDA/HIP unification #1616

Merged
merged 8 commits into from
Jun 27, 2024
Merged

Prepare for more CUDA/HIP unification #1616

merged 8 commits into from
Jun 27, 2024

Conversation

upsj
Copy link
Member

@upsj upsj commented May 20, 2024

To simplify the review of #1516, this does all the necessary changes to automate the majority of the other changes

  • Add necessary switching headers
  • Provide device namespace macro via compiler definitions
  • Add necessary (namespace) aliases
  • adapt math lib includes and namespaces
  • uniformize files

@upsj upsj requested a review from a team May 20, 2024 06:32
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:openmp This is related to the OpenMP module. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. type:factorization This is related to the Factorizations type:multigrid This is related to multigrid mod:dpcpp This is related to the DPC++ module. labels May 20, 2024
@upsj upsj self-assigned this May 20, 2024
@upsj upsj added the 1:ST:ready-for-review This PR is ready for review label May 20, 2024
@upsj upsj force-pushed the prepare-unification branch 3 times, most recently from edc816d to c96101e Compare May 22, 2024 12:04
@MarcelKoch MarcelKoch self-requested a review June 20, 2024 14:05
Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, looks good to me. I've left some remarks, which are mostly quite minor.

The only thing I would like to see changed is to revert the header ordering of the hip runtime header. I think it is already causing compilation issues.

Another question is about the as_cuda_type, and as_hip_type. Should they also be unified?

common/cuda_hip/base/blas_bindings.hpp Outdated Show resolved Hide resolved
common/cuda_hip/base/runtime.hpp Show resolved Hide resolved
cuda/base/cublas_bindings.hpp Show resolved Hide resolved
cuda/base/cusparse_bindings.hpp Show resolved Hide resolved
cuda/base/types.hpp Outdated Show resolved Hide resolved
cuda/factorization/par_ilu_kernels.cu Outdated Show resolved Hide resolved
cuda/factorization/par_ilut_filter_kernels.cu Show resolved Hide resolved
hip/CMakeLists.txt Outdated Show resolved Hide resolved
hip/base/config.hip.hpp Show resolved Hide resolved
include/ginkgo/core/base/executor.hpp Outdated Show resolved Hide resolved
hip/CMakeLists.txt Outdated Show resolved Hide resolved
omp/CMakeLists.txt Outdated Show resolved Hide resolved
hip/CMakeLists.txt Outdated Show resolved Hide resolved
@upsj
Copy link
Member Author

upsj commented Jun 24, 2024

@MarcelKoch as_cuda/hip_type get replaced by the automatic script, so I didn't include the changes here

Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. when changing <hip/hip_runtime.h> to "common/cuda_hip/base/runtime.hpp", the header order should be also considered.
  2. because the files are not merged together in this pr, some hip check in cuda files or some cuda check in hip files should be moved to another pr
  3. EXEC_NAMESPACE -> GKO_DEVICE_NAMESPACE is not necessary

Also, if the code path are quite different between cuda/hip, I will prefer the file-based not the macro to distinguish for readibility.

cuda/matrix/ell_kernels.cu Show resolved Hide resolved
cuda/matrix/fbcsr_kernels.template.cu Show resolved Hide resolved
cuda/matrix/sparsity_csr_kernels.cu Show resolved Hide resolved
cuda/preconditioner/jacobi_kernels.cu Show resolved Hide resolved
cuda/solver/cb_gmres_kernels.cu Show resolved Hide resolved
hip/preconditioner/jacobi_kernels.hip.cpp Show resolved Hide resolved
include/ginkgo/core/base/executor.hpp Outdated Show resolved Hide resolved
test/base/index_range.cpp Show resolved Hide resolved
common/cuda_hip/base/runtime.hpp Show resolved Hide resolved
cuda/base/types.hpp Show resolved Hide resolved
@upsj upsj requested a review from yhmtsai June 26, 2024 16:13
Copy link
Member

@yhmtsai yhmtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the code self, only some macro can still use the unified one.
because
But I have some concerns:
Will it make the header more fragment? need to jump many levels and figure out what it is. Also, the distinguishment of two implementations is based on compile flag, so the editor may not find the correct one or always show both. (my editor show pointer_gaurd both implementation but can not jump for blas::gemm)
another concern is that it may destroy iwyu although we do not really follow this rule in all codes seriously.

cuda/base/cusparse_bindings.hpp Show resolved Hide resolved
cuda/solver/common_trs_kernels.cuh Outdated Show resolved Hide resolved
hip/base/hipsparse_bindings.hip.hpp Show resolved Hide resolved
hip/solver/common_trs_kernels.hip.hpp Outdated Show resolved Hide resolved
hip/solver/common_trs_kernels.hip.hpp Outdated Show resolved Hide resolved
upsj and others added 6 commits June 27, 2024 16:47
- Add necessary switching headers
- Provide device namespace macro via compiler definitions
- Add necessary (namespace) aliases
- adapt math lib includes and namespaces
- uniformize files
- fix HIP compilation issues
- uniform ifdef checks
- deviceComplex type aliases
- remove unnecessary includes

Co-authored-by: Marcel Koch <[email protected]>
- make sparselib/blas the only non-deprecated way of getting handles
- fix header orders

Co-authored-by: Yuhsiang M. Tsai <[email protected]>
@upsj upsj added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Jun 27, 2024
@upsj upsj added the 1:ST:no-changelog-entry Skip the wiki check for changelog update label Jun 27, 2024
@upsj upsj merged commit becfd87 into develop Jun 27, 2024
12 of 14 checks passed
@upsj upsj deleted the prepare-unification branch June 27, 2024 20:41
Copy link

sonarcloud bot commented Jun 28, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
47.4% Duplication on New Code (required ≤ 20%)

See analysis details on SonarCloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:no-changelog-entry Skip the wiki check for changelog update 1:ST:ready-to-merge This PR is ready to merge. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:dpcpp This is related to the DPC++ module. mod:hip This is related to the HIP module. mod:openmp This is related to the OpenMP module. reg:build This is related to the build system. reg:testing This is related to testing. type:factorization This is related to the Factorizations type:matrix-format This is related to the Matrix formats type:multigrid This is related to multigrid type:preconditioner This is related to the preconditioners type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants