Add Batch jacobi device kernels #1600

Merged
tcojean merged 14 commits into develop from batch-jacobi-device
May 29, 2024

pratikvn
Member

This PR adds the device kernels for batch Jacobi preconditioner.
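
For readers unfamiliar with the preconditioner, the following is a minimal host-side sketch of what a batched scalar Jacobi preconditioner computes. It is not the code added in this PR: the function names, the flat storage layout, and the use of `double` are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout (hypothetical): every batch item has the same size `n`, and
// item `b` owns the entries [b * n, (b + 1) * n) of each flat array.

// Generation step: store the inverse of every diagonal entry.
std::vector<double> generate_inverse_diagonal(const std::vector<double>& diag)
{
    std::vector<double> inv_diag(diag.size());
    for (std::size_t i = 0; i < diag.size(); ++i) {
        inv_diag[i] = 1.0 / diag[i];  // assumes a nonzero diagonal
    }
    return inv_diag;
}

// Application step: z = D^{-1} r, computed independently per batch item.
std::vector<double> apply_scalar_jacobi(const std::vector<double>& inv_diag,
                                        const std::vector<double>& r,
                                        std::size_t num_batch_items,
                                        std::size_t n)
{
    std::vector<double> z(r.size());
    for (std::size_t b = 0; b < num_batch_items; ++b) {
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t idx = b * n + i;
            z[idx] = inv_diag[idx] * r[idx];
        }
    }
    return z;
}
```

On a device, the outer loop over batch items would typically map to independent work-groups or thread blocks, which is why the batch items never need to communicate with each other.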

@pratikvn pratikvn added the type:batched-functionality This is related to the batched functionality in Ginkgo label Apr 18, 2024
@pratikvn pratikvn self-assigned this Apr 18, 2024
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. mod:cuda This is related to the CUDA module. type:preconditioner This is related to the preconditioners mod:hip This is related to the HIP module. labels Apr 18, 2024
@MarcelKoch MarcelKoch added this to the Ginkgo 1.8.0 milestone Apr 19, 2024
@pratikvn pratikvn force-pushed the batch-jacobi-device branch 2 times, most recently from 96c2138 to 5d26c0e Compare April 26, 2024 12:48
@pratikvn pratikvn changed the title WIP: Batch jacobi device kernels Add Batch jacobi device kernels Apr 26, 2024
@pratikvn pratikvn added the 1:ST:ready-for-review This PR is ready for review label Apr 26, 2024
@pratikvn pratikvn requested review from a team April 26, 2024 12:50
@pratikvn pratikvn force-pushed the batch-jacobi-device branch 3 times, most recently from 810bc89 to c5c918e Compare April 30, 2024 08:26
@pratikvn pratikvn force-pushed the batch-prec-jacobi branch 2 times, most recently from 32de29b to fcff54f Compare May 6, 2024 22:59
Member

@MarcelKoch MarcelKoch left a comment

Overall looks good. Mostly minor comments regarding consistency left.

One thing I've noticed is that some of the functions in the device preconditioner classes explicitly synchronize at the end, while others don't. Maybe it would be best to find a consistent behavior. I think I would prefer to not synchronize in the functions, but instead require synchronization after the function call.

test/preconditioner/batch_jacobi_kernels.cpp (outdated, resolved)
test/preconditioner/batch_jacobi_kernels.cpp (outdated, resolved)
test/preconditioner/batch_jacobi_kernels.cpp (outdated, resolved)
test/preconditioner/batch_jacobi_kernels.cpp (outdated, resolved)
common/cuda_hip/preconditioner/batch_scalar_jacobi.hpp.inc (outdated, resolved)
dpcpp/preconditioner/batch_preconditioners.hpp (outdated, resolved)
common/cuda_hip/preconditioner/batch_block_jacobi.hpp.inc (outdated, resolved)
Comment on lines +130 to +131
// constexpr int subwarp_size =
// gko::kernels::dpcpp::jacobi::get_larger_power(compiled_max_block_size);
// TODO: Find a way to allow smaller block_sizes (<16)

This sounds like the whole kernel selection approach is unnecessary here. So maybe remove the template parameter for now and add it back later.
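
To make the commented-out line concrete, here is a small sketch of how a compile-time sub-warp size could be derived from a maximum block size. `next_power_of_two` is a hypothetical stand-in, not Ginkgo's `get_larger_power`; the assumption that the helper rounds up to the next power of two, and the value chosen for `compiled_max_block_size`, are illustrative only.

```cpp
// Hypothetical stand-in for the helper named in the commented-out code.
// Assumption: it rounds a small positive `value` up to a power of two.
constexpr int next_power_of_two(int value, int guess = 1)
{
    return guess >= value ? guess : next_power_of_two(value, guess * 2);
}

// Value assumed for illustration; the real constant lives in the build
// configuration.
constexpr int compiled_max_block_size = 32;
constexpr int subwarp_size = next_power_of_two(compiled_max_block_size);

static_assert(subwarp_size == 32, "32 is already a power of two");
static_assert(next_power_of_two(20) == 32, "20 rounds up to 32");
```

If only one such value is ever instantiated, a template parameter carrying it adds no choice at the call site, which appears to be the point of the comment above.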

dpcpp/preconditioner/batch_jacobi_kernels.hpp.inc (outdated, resolved)
Base automatically changed from batch-prec-jacobi to develop May 9, 2024 16:53
@pratikvn pratikvn requested a review from MarcelKoch May 9, 2024 20:42
@pratikvn pratikvn added the 1:ST:no-changelog-entry Skip the wiki check for changelog update label May 9, 2024
Member

@yhmtsai yhmtsai left a comment

One major thing left: the handling of the d value in invert_dense_block differs between SYCL and CUDA.

dpcpp/preconditioner/batch_jacobi_kernels.dp.cpp (outdated, resolved)
dpcpp/preconditioner/batch_jacobi_kernels.dp.cpp (outdated, resolved)
dpcpp/preconditioner/batch_jacobi_kernels.dp.cpp (outdated, resolved)
Comment on lines +48 to +49
common_generate_for_all_system_matrix_types(batch_id);
item_ct1.barrier(sycl::access::fence_space::local_space);

The barrier likely adds quite some overhead.
If it is unnecessary, maybe remove it.

@@ -257,7 +249,7 @@ TEST_F(BatchJacobi, CanSolveLargeMatrixSizeHpdSystemWithScalarJacobi)
     for (size_t i = 0; i < num_batch_items; i++) {
         auto comp_res_norm = res.host_res_norm->get_const_values()[i] /
                              linear_system.host_rhs_norm->get_const_values()[i];
-        ASSERT_LE(iter_counts->get_const_data()[i], max_iters);
+        EXPECT_LT(iter_counts->get_const_data()[i], max_iters);

If it can possibly be equal, maybe use EXPECT_LE instead of EXPECT_LT?
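
The difference between the strict and non-strict bound can be seen with a toy iteration counter. This is only a sketch under stated assumptions: `count_iterations` is a hypothetical function whose residual simply halves each step, and it does not model the batch solver used in the test.

```cpp
// Toy solver loop (hypothetical): the residual halves every iteration, and
// the loop stops once the residual is at most `tol` or the cap is reached.
int count_iterations(double initial_res, double tol, int max_iters)
{
    double res = initial_res;
    int iters = 0;
    while (res > tol && iters < max_iters) {
        res *= 0.5;
        ++iters;
    }
    return iters;
}
```

With `count_iterations(1.0, 0.125, 3)` the loop converges on exactly the third iteration, so the count equals `max_iters`. A check of the form `count <= max_iters` passes in that case, while `count < max_iters` fails even though the solve succeeded.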

@pratikvn pratikvn requested a review from yhmtsai May 13, 2024 13:35
@pratikvn
Member Author

Will merge this only once #1396 has been merged.

@pratikvn pratikvn added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels May 13, 2024
@pratikvn pratikvn force-pushed the batch-jacobi-device branch 3 times, most recently from 973c77b to 37e71d8 Compare May 16, 2024 21:20
@tcojean
Member

tcojean commented May 22, 2024

@pratikvn I will check if I can find the issue with the SYCL backend. If I'm not successful, we can just not merge that.

codecov bot commented May 28, 2024

Codecov Report

Attention: Patch coverage is 84.90566%, with 8 lines in your changes missing coverage. Please review.

Project coverage is 89.28%. Comparing base (1fc3d7d) to head (d7bc39e).
Report is 13 commits behind head on develop.

Current head d7bc39e differs from pull request most recent head 691a479

Please upload reports for the commit 691a479 to get more accurate results.

Files                                               Patch %   Lines
reference/preconditioner/batch_scalar_jacobi.hpp    40.00%    6 Missing ⚠️
reference/preconditioner/batch_block_jacobi.hpp     66.66%    1 Missing ⚠️
test/preconditioner/batch_jacobi_kernels.cpp        97.22%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1600      +/-   ##
===========================================
- Coverage    89.97%   89.28%   -0.70%     
===========================================
  Files          752      752              
  Lines        60445    60467      +22     
===========================================
- Hits         54387    53985     -402     
- Misses        6058     6482     +424     


pratikvn and others added 14 commits May 29, 2024 08:58
Co-authored-by: Isha Aggarwal
Co-authored-by: Aditya Kashi
Co-authored-by: Phuong Nguyen
Co-authored-by: Marcel Koch
Co-authored-by: Marcel Koch
Co-authored-by: Yu-Hsiang Tsai
The failure is fairly minimal as it happens only on some integrated GPUs
(some Gen 11). Nonetheless, similarly to the DPC++ non-batch Jacobi kernels,
this indicates a bigger investigation and fix of these kernels is
necessary.
@tcojean tcojean merged commit 1547509 into develop May 29, 2024
11 of 15 checks passed
@tcojean tcojean deleted the batch-jacobi-device branch May 29, 2024 12:05
Labels
1:ST:no-changelog-entry (Skip the wiki check for changelog update)
1:ST:ready-to-merge (This PR is ready to merge.)
mod:cuda (This is related to the CUDA module.)
mod:hip (This is related to the HIP module.)
reg:build (This is related to the build system.)
type:batched-functionality (This is related to the batched functionality in Ginkgo)
type:preconditioner (This is related to the preconditioners)
5 participants