Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build failure: rocmPackages, rocmPackages_5 #305641

Closed
mschwaig opened this issue Apr 20, 2024 · 5 comments · Fixed by #306375
Closed

Build failure: rocmPackages, rocmPackages_5 #305641

mschwaig opened this issue Apr 20, 2024 · 5 comments · Fixed by #306375
Labels
0.kind: build failure A package fails to build

Comments

@mschwaig
Copy link
Member

mschwaig commented Apr 20, 2024

Steps To Reproduce

The build for a lot of ROCm components broke with 6083fb0, which updated CMake from 3.29.0 to 3.29.1, basically breaking all of ROCm.
This commit was merged via #298979 and recently made its way from staging into unstable in the last few days.

Steps to reproduce the behavior:

  1. nix build nixpkgs/716b484faececc3cd4570cc612ec9e927427d682#rocmPackages.rccl

Build log

The builds for a number of ROCm components now fail with error messages like the following:

CMake Error in library/CMakeLists.txt:
  Imported target "hip::device" includes non-existent path

    "/include"

  in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:

  * The path was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and references files it does not
  provide.

Additional context

This is possible related to how ROCm uses CMAKE_INSTALL_<dir> of CMake incorrectly.
See #197838 (comment), ROCm/hipamd#55, and ROCm/rocm-cmake#121.

I have posted about this issue on the CMake discourse to verify if this is considered a bug in CMake, as suggested by @gracicot: https://discourse.cmake.org/t/possible-regression-related-to-interface-include-directories-in-2-29-1/10705

Notify maintainers

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
output here

Add a 👍 reaction to issues you find important.

@mschwaig mschwaig added the 0.kind: build failure A package fails to build label Apr 20, 2024
@mschwaig
Copy link
Member Author

The first recommendation we got from CMake was updating to 3.29.2, which has already been merged to staging:
#304111

@mschwaig
Copy link
Member Author

I was able to confirm that with that commit that updates CMake to 3.29.2 rocmPackages.rccl builds again.

I will leave this issue open until the relevant changes make it into unstable.

@jonas-w
Copy link
Contributor

jonas-w commented Apr 21, 2024

@mschwaig is there an easy way to temporarily use the cmake version from staging in my configuration for a specific package, without using staging for every dependency of that package?

@marsam
Copy link
Contributor

marsam commented Apr 21, 2024

A workaround could be to apply this as patch:

--- a/pkgs/development/rocm-modules/6/clr/default.nix
+++ b/pkgs/development/rocm-modules/6/clr/default.nix
@@ -111,6 +111,9 @@ in stdenv.mkDerivation (finalAttrs: {
       --replace "install(PROGRAMS \''${HIPCC_BIN_DIR}/hipcc.bat DESTINATION bin)" "" \
       --replace "install(PROGRAMS \''${HIPCC_BIN_DIR}/hipconfig.bat DESTINATION bin)" ""
 
+    substituteInPlace hipamd/hip-config-amd.cmake \
+      --replace-fail "\''${_IMPORT_PREFIX}/include" '${placeholder "out"}/include'
+
     substituteInPlace hipamd/src/hip_embed_pch.sh \
       --replace "\''$LLVM_DIR/bin/clang" "${clang}/bin/clang"
   '';

@mschwaig
Copy link
Member Author

One possible fix would be overriding the CMake version that is used inside of ROCm to update to the newer version, and merging that with a PR to master.
The advantage of this compared to #305664 would be that we do not need to worry about the fix actually breaking the build once the version update lands (I have not tested that this happens, but I could imagine it does).

It might be a good idea to do this by making the overridden version of cmake a part of rocm-cmake, and removing extra dependency on cmake elsewhere in ROCm.
Also ROCm 5.7 and 6.0 are both broken, so if we want to merge a fix before the CMake version update lands in unstable, we should fix both.

I am not working on a fix for this, and I couldn't merge one, but I'm happy to review PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: build failure A package fails to build
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants