We are interested in adding support for DGL to run on AMD GPUs. This was previously requested by users in #2659. We can make use of the hipify tooling.
I've already created a prototype that passes almost all of the C++ unit tests (there's some missing functionality upstream in HIP/ROCm that will need to be addressed) and runs through all the blitz tutorials. I have only experimented with the PyTorch backend. PyTorch already calls all GPUs "cuda", so existing Torch Python code doesn't need to be modified. Within DGL, the prototype follows the same pattern of overloading the CUDA types.
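For readers unfamiliar with the tooling, the core of what hipify does is mechanical token translation from the CUDA API to the HIP API. Here's a minimal sketch of the idea; the mapping table is abbreviated and the function name is made up, and the real hipify tools handle far more (kernel launch syntax, library calls like cuBLAS → hipBLAS, headers, etc.):

```python
import re

# Abbreviated CUDA -> HIP token map (illustrative only; the real hipify
# tooling ships a much larger table covering the full runtime API).
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaFree": "hipFree",
    "cudaMemcpy": "hipMemcpy",
    "cudaStream_t": "hipStream_t",
    "cudaError_t": "hipError_t",
}

# Longest-match-first so tokens that share a prefix translate correctly.
_pattern = re.compile("|".join(sorted(CUDA_TO_HIP, key=len, reverse=True)))

def hipify_source(text: str) -> str:
    """Replace known CUDA identifiers with their HIP equivalents."""
    return _pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], text)

print(hipify_source("cudaError_t err = cudaMalloc(&ptr, size);"))
# -> hipError_t err = hipMalloc(&ptr, size);
```

Because the two APIs correspond almost one-to-one, this translation can be applied automatically as part of a build rather than maintained by hand.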
Structure
The prototype currently converts the code in-place, modifying it to use HIP instead of CUDA. Presumably that isn't desirable for the main repository, so the options are:

1. The HIP version lives on a separate branch, constructed by hipifying the default branch.
2. The HIP version is checked in to a parallel directory structure (mostly generated by hipifying the main source code).
3. The HIP version is generated dynamically by a build script (this is what PyTorch does).
4. HIP is added as a fully supported device type separate from CUDA. This would be considerably more work, but could be a good option long term if this gains traction.
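To make option 3 concrete, the build-script approach amounts to walking the CUDA source tree at build time and emitting a hipified copy into a parallel output directory, which the build system then compiles. A rough sketch follows; the function name, paths, and the trivial string substitution are all stand-ins (a real build would shell out to hipify-perl or use hipify-clang instead):

```python
import shutil
import tempfile
from pathlib import Path

def generate_hip_tree(src_root: Path, out_root: Path) -> None:
    """Mirror a CUDA source tree into a build directory, hipifying
    source files along the way. Illustrative only: a real script would
    invoke the actual hipify tooling, not a naive string replace."""
    for src in src_root.rglob("*"):
        if not src.is_file():
            continue
        dst = out_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.suffix in {".cu", ".cuh", ".h", ".cc"}:
            # Stand-in for hipify-perl / hipify-clang.
            dst.write_text(src.read_text().replace("cuda", "hip"))
        else:
            shutil.copy2(src, dst)

# Demonstration on a throwaway tree (file names are made up):
workdir = Path(tempfile.mkdtemp())
src, out = workdir / "src" / "cuda", workdir / "build" / "hip"
src.mkdir(parents=True)
(src / "spmm.cu").write_text("cudaMalloc(&ptr, nbytes);")
generate_hip_tree(src, out)
hipified = (out / "spmm.cu").read_text()
print(hipified)  # -> hipMalloc(&ptr, nbytes);
```

The appeal of this scheme is that the CUDA sources stay the single source of truth: nothing generated is checked in, and the HIP tree is rebuilt whenever the CUDA code changes.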
Barring strong reasons to the contrary, I think following PyTorch's example (option 3) makes the most sense.
In addition to threading through the appropriate build options, the prototype makes a few changes to the source code prior to hipification. I think they are (or can be made) relatively unobjectionable, or at least hidden behind macros so they can't affect the normal build. If those changes aren't acceptable, though, then structure option 3 above becomes trickier. It's also possible that achieving high performance (as opposed to just correctness) on AMD GPUs would require more invasive modifications. I think it's best to address those as they come up, but I want to acknowledge that this isn't zero-cost from a maintainability perspective and might end up creating conflicting pressures.