[Tracking] ROCm packages #197885
Comments
Updating to 5.3.1, marking all WIP until pushed to their respective PRs and verified. |
|
Hi, thanks a lot for your work on ROCm packages! So far, the updates were all aggregated in a single
tl;dr, do you mind merging all your 5.3.1 updates into a single PR? PS: Not sure how you did the update, I usually do it with |
I was actually afraid of the opposite being true so I split them up. |
Hip I think should stay separate though, since there are other changes. |
ROCm 6.0.0 has been released. |
Interesting - now pytorch works for me, but it doesn't seem to work correctly. I'm trying to generate an image from sdxl+lora with diffusers, and it generates an incorrect image... I tried identical code and model with manually defined seeds in google colab with cuda - it works there. Also seems to work locally on cpu with f32 types. (or it might be some problem in one of the libs, since locally I use all python libs from nix) |
The |
Hey, giving this a try. Still very much WIP, but it's working so far for my current project. |
@Madouura First, thanks for all your work on this front. You left a comment to the effect that rocBLASLt is "Very broken with Tensile at the moment, only supports GFX9". It looks like other platforms might be supported now, but I wondered if you might be able to elaborate on the "very broken with Tensile" part. I notice that they ship a vendored "Tensilelite"; was that what you were trying to use? Any pointers you have on how I might manage to build this would be useful. I'm currently eyeing the rocBLAS derivation as a potentially good starting point. Edit: no longer a priority for me |
pytorch now fails to build after 5 -> 6 transition, because it depends on miopengemm which was removed. |
I edited the description to add an entry for rocblaslt. It's, apparently, a dependency for zluda |
Apparently pytorch now requires
|
As per pytorch/pytorch#119081 (comment), in 2.4.0+ (future release) it should be possible to use something like:
pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
  (python-final: python-prev: {
    torch = python-prev.torch.overrideDerivation (oldAttrs: {
      TORCH_BLAS_PREFER_HIPBLASLT = 0; # not yet in nixpkgs
    });
  })
]; |
@ony, TORCH_BLAS_PREFER_HIPBLASLT is a runtime environment variable; pytorch still links against and requires hipblaslt, even when it is unused. pytorch/pytorch#120551 should help, but I have no idea whether or when it will be accepted. By the way, hipblaslt is not difficult to build. Just don't build the 6.0 release; skip directly to 6.1. When I tried, the bundled Tensilelite in 6.0 generated a wall of unreadable errors, while 6.1 worked on the first attempt. |
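To illustrate the runtime nature of the variable mentioned above, here is a minimal Python sketch that sets TORCH_BLAS_PREFER_HIPBLASLT in the environment of a child process. The helper name run_with_hipblaslt_disabled is hypothetical, and the child command only echoes the variable as a stand-in for a real pytorch script. This only influences what the process sees at run time; as noted above, it does not remove the link-time dependency on hipblaslt.

```python
import os
import subprocess
import sys


def run_with_hipblaslt_disabled(argv):
    """Run argv with TORCH_BLAS_PREFER_HIPBLASLT=0 set in its environment.

    Hypothetical helper for illustration; the variable name comes from the
    thread, everything else here is an assumption.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["TORCH_BLAS_PREFER_HIPBLASLT"] = "0"
    return subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in for a real pytorch script: the child just prints the variable.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('TORCH_BLAS_PREFER_HIPBLASLT'))",
    ]
    result = run_with_hipblaslt_disabled(child)
    print(result.stdout.strip())
```

In practice the child command would be the script that imports torch, so the variable is already present in the environment when the library is loaded.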
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/testing-gpu-compute-on-amd-apu-nixos/47060/4 |
I'm not able to build rocmlir-rock-6.0.2 when trying to install zluda.
Is there an easy fix for it? |
@DerDennisOP, it was addressed in pull request ROCm/rocMLIR#1640 (issue ROCm/rocMLIR#1620); you may want to use it. |
@DerDennisOP @AngryLoki I think you'll actually also need ROCm/rocMLIR#1542 (closes ROCm/rocMLIR#1500). It is a similar patch in a nearby file. |
Is there a plan to patch these things upstream too? As far as I can see, the Hydra logs show the same error that @DerDennisOP reported. |
Right now I do not have the time to update ROCm, but I could help out as a reviewer. |
While I do not have that much experience with ROCm, I could try it. |
FWIW, I have a branch where I have tried updating things to 6.2.4. Unfortunately, I am seeing linking failures in the
I don't think any of those PRs are viable. Apart from the MRs themselves not building, the auto-updater generally doesn't seem to respect the fact that ROCm components expect to be upgraded in lock-step. |
FWIW, I have opened a draft MR to record the state of my attempt: #364423 |
I have a mix of 6.3 and 6.2 working here with pytorch nightly but in no state to upstream. Might be helpful for someone with more time trying to fix it in nixpkgs. |
Tracking issue for ROCm derivations.
Key
WIP
Ready
TODO
Merged
ROCm-related
Notes
nix-shell maintainers/scripts/update.nix --argstr commit true --argstr keep-going true --arg predicate '(path: pkg: builtins.elem (pkg.pname or null) [ "rocm-llvm-llvm" "rocm-core" "rocm-cmake" "rocm-thunk" "rocm-smi" "rocm-device-libs" "rocm-runtime" "rocm-comgr" "rocminfo" "clang-ocl" "rdc" "rocm-docs-core" "hip-common" "hipcc" "clr" "hipify" "rocprofiler" "roctracer" "rocgdb" "rocdbgapi" "rocr-debug-agent" "rocprim" "rocsparse" "rocthrust" "rocrand" "rocfft" "rccl" "hipcub" "hipsparse" "hipfort" "hipfft" "tensile" "rocblas" "rocsolver" "rocwmma" "rocalution" "rocmlir" "hipsolver" "hipblas" "miopengemm" "composable_kernel" "half" "miopen" "migraphx" "rpp-hip" "mivisionx-hip" "hsa-amd-aqlprofile-bin" ])'
Won't implement
strictDeps for all derivations