
[Proof of Concept] TrixiParticles.jl on GPUs with KernelAbstractions.jl #474

Closed
wants to merge 7 commits into from

Conversation

efaulhaber
Member

@efaulhaber efaulhaber commented Mar 20, 2024

Like #472, but now with KernelAbstractions.jl. By using the appropriate array types, we should be able to run the exact same code on AMD, Intel, or even Apple GPUs.
This PR required minimal changes from #472. See efaulhaber#1 for a diff.

Originally, this only worked for problem sizes smaller than the GPU (#particles < #threads of the GPU), because I just set the number of threads to the number of particles instead of using blocks.
This is now solved here. Unlike CUDA.jl, KernelAbstractions.jl chooses the number of threads and the block size automatically. Maybe CUDA.jl can do the same; I don't know.
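
To illustrate the automatic launch configuration mentioned above, here is a minimal sketch of a KernelAbstractions.jl kernel. It is not the code from this PR: `scale_kernel!` and the arrays are hypothetical names made up for the example, and it runs on the CPU backend so that it works without a GPU. Passing a GPU array type instead should select the matching GPU backend via `get_backend`.

```julia
using KernelAbstractions

# Hypothetical example kernel (not from TrixiParticles.jl):
# each work item handles exactly one array entry.
@kernel function scale_kernel!(out, @Const(x), factor)
    i = @index(Global)
    out[i] = factor * x[i]
end

x = rand(Float32, 100_000)
out = similar(x)

# The backend is derived from the array type (CPU() for a plain Array).
backend = get_backend(out)

# No workgroup size is passed here, so KernelAbstractions.jl picks one.
kernel! = scale_kernel!(backend)

# Only the total number of work items (ndrange) is specified.
kernel!(out, x, 2.0f0; ndrange=length(out))
KernelAbstractions.synchronize(backend)
```

Since only `ndrange` is given, the number of work items is not tied to the number of threads of the device, which is what allows problem sizes larger than the GPU.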

The big advantage now is that we can use problem sizes larger than the GPU, and there it's actually fast!
Here is a comparison between the M2 Pro CPU in my laptop (the Threadripper workstation is currently busy) and an RTX 3090:
[Image: gpu1]

@efaulhaber efaulhaber mentioned this pull request Apr 2, 2024
10 tasks
@efaulhaber efaulhaber added the gpu label Apr 2, 2024
@efaulhaber efaulhaber changed the title from "Proof of Concept: TrixiParticles.jl on GPUs with KernelAbstractions.jl" to "[Proof of Concept] TrixiParticles.jl on GPUs with KernelAbstractions.jl" Apr 2, 2024
@efaulhaber
Member Author

Superseded by PRs in #484.

@efaulhaber efaulhaber closed this Jun 12, 2024