Question about the cmake workflow when including it in an external project. #280

Closed
neoblizz opened this issue Feb 6, 2023 · 10 comments

@neoblizz
Member

neoblizz commented Feb 6, 2023

What is the expected behavior

I wrote a CMake FetchContent script to fetch a version of rocSparse from GitHub instead of relying on the version pre-installed on the system (through rocm-lib). The script (FetchROCSparse.cmake) looks like this:

include(FetchContent)
set(FETCHCONTENT_QUIET ON)

message(STATUS "Cloning External Project: Rocsparse")
get_filename_component(FC_BASE "../externals"
                REALPATH BASE_DIR "${CMAKE_BINARY_DIR}")
set(FETCHCONTENT_BASE_DIR ${FC_BASE})

set(CMAKE_CXX_COMPILER_ID "Clang")
set(BUILD_CLIENTS_SAMPLES OFF)

FetchContent_Declare(
    rocsparse
    GIT_REPOSITORY https://github.com/ROCmSoftwarePlatform/rocSPARSE.git
    GIT_TAG        rocm-5.4.2
)

FetchContent_GetProperties(rocsparse)
if(NOT rocsparse_POPULATED)
  FetchContent_Populate(
    rocsparse
  )
endif()
# Exposing rocsparse's source and include directory
set(ROCSPARSE_SOURCE_DIR "${rocsparse_SOURCE_DIR}")
set(ROCSPARSE_BUILD_DIR "${rocsparse_BINARY_DIR}")

# Add subdirectory ::rocsparse
add_subdirectory(${ROCSPARSE_SOURCE_DIR})

The output of this is:

-- Cloning External Project: Rocsparse
-- The Fortran compiler identification is Flang 99.99.1
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /opt/rocm/llvm/bin/flang - skipped
-- Using hip-clang to build for amdgpu backend
CMake Warning (dev) at externals/rocsparse-src/CMakeLists.txt:72 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'BUILD_CLIENTS_SAMPLES'.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Git: /usr/bin/git (found version "2.34.1")
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Failed
-- AMDGPU_TARGETS: gfx900;gfx906;gfx908;gfx90a;gfx1030
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Failed
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Backward Compatible Sym Link Created for include directories

Notice that it is trying to add the following targets: AMDGPU_TARGETS: gfx900;gfx906;gfx908;gfx90a;gfx1030. This is exactly the problem: even when I override the AMDGPU_TARGETS variable, it still tries to add all of those targets, and compilation fails.

What actually happens

What I would like to do is just specify which target I want, for example:

set(CMAKE_HIP_ARCHITECTURES gfx908)
set(AMDGPU_TARGETS ${CMAKE_HIP_ARCHITECTURES} FORCE)

Or better yet, why does it not automatically rely on CMAKE_HIP_ARCHITECTURES or CMAKE_CUDA_ARCHITECTURES?
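
One detail worth checking in the snippet above: FORCE is only an option of the set(... CACHE ...) signature. In the normal-variable form it becomes part of the value, so AMDGPU_TARGETS ends up as the list gfx908;FORCE. A minimal sketch of the cache form follows. Whether hip-config.cmake or rocSPARSE's own CMakeLists.txt in ROCm 5.4.2 then leaves the value alone is an assumption that this thread does not confirm.

```cmake
# Architecture to build for; gfx908 is the example used in this thread.
set(CMAKE_HIP_ARCHITECTURES gfx908)

# FORCE requires the CACHE signature:
#   set(<variable> <value> CACHE <type> <docstring> [FORCE])
# The docstring below is illustrative.
set(AMDGPU_TARGETS "${CMAKE_HIP_ARCHITECTURES}"
    CACHE STRING "AMD GPU targets to compile for" FORCE)

# Print the value that code running after this point will read.
message(STATUS "AMDGPU_TARGETS = ${AMDGPU_TARGETS}")
```

The set() call has to run before the add_subdirectory() call that pulls in rocSPARSE for it to have any chance of taking effect.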

How to reproduce

Please see the script above.

Environment

Everything is running 5.4.2 (ROCM/HIP/rocSparse).

Alternatively, can you recommend a better approach for including rocSparse (built from source) in our external projects? Thank you, any help here is appreciated!

@YvanMokwinski
Collaborator

Please use the install script install.sh from the rocSPARSE repo.
./install.sh --help will list the options of the script.

@neoblizz
Member Author

neoblizz commented Feb 6, 2023

Thank you for the suggestion! I am looking for something that is entirely self-contained within the CMake file (fetch, build, link). Can you suggest a method for configuring that script within the CMake?

@YvanMokwinski
Collaborator

https://cmake.org/cmake/help/latest/command/execute_process.html?highlight=execute_process
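
A rough sketch of what driving install.sh through execute_process could look like is shown below. It reuses the FetchContent declaration from the original script. The ROCSPARSE_INSTALL_ARGS variable is hypothetical and deliberately left empty: the actual options should be taken from ./install.sh --help rather than from this example. The script runs at configure time, so every fresh configure would pay for the build.

```cmake
include(FetchContent)

FetchContent_Declare(
    rocsparse
    GIT_REPOSITORY https://github.com/ROCmSoftwarePlatform/rocSPARSE.git
    GIT_TAG        rocm-5.4.2
)

# Download the sources only; do not add them to this build.
FetchContent_GetProperties(rocsparse)
if(NOT rocsparse_POPULATED)
  FetchContent_Populate(rocsparse)
endif()

# Hypothetical variable: fill it with options listed by `./install.sh --help`.
set(ROCSPARSE_INSTALL_ARGS "" CACHE STRING "Arguments passed to rocSPARSE's install.sh")

# Run the install script from the fetched source tree at configure time.
execute_process(
    COMMAND "${rocsparse_SOURCE_DIR}/install.sh" ${ROCSPARSE_INSTALL_ARGS}
    WORKING_DIRECTORY "${rocsparse_SOURCE_DIR}"
    RESULT_VARIABLE ROCSPARSE_INSTALL_RESULT
)

# Stop the configure step if the script reported a failure.
if(NOT ROCSPARSE_INSTALL_RESULT EQUAL 0)
  message(FATAL_ERROR "install.sh failed with exit code ${ROCSPARSE_INSTALL_RESULT}")
endif()
```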

@cgmb
Contributor

cgmb commented Feb 7, 2023

Support for add_subdirectory is on my wishlist, but there's a decent bit of work to do in order to support it across all the ROCm math and communication libraries.

-- Performing Test COMPILER_HAS_TARGET_ID_gfx803
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803 - Failed

This concerns me. That looks like a rocm-cmake bug. Do you have the CMake error log for these failures?

-- AMDGPU_TARGETS: gfx900;gfx906;gfx908;gfx90a;gfx1030

Unfortunately, hip-config.cmake sets a selection of AMDGPU_TARGETS by default. That is where those values are coming from. In my opinion, that behaviour is misguided. Among other things, it breaks the architecture autodetection that is built into the compiler. There was a ticket raised about this, but it didn't end up going anywhere. I'll raise the discussion again, and maybe we can fix this for ROCm 6.0.
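
To see this effect in isolation, a small throwaway project can print AMDGPU_TARGETS before and after find_package(hip). This is only a diagnostic sketch: the project name is made up, and it assumes ROCm is installed under /opt/rocm, as in the paths that appear in this thread.

```cmake
cmake_minimum_required(VERSION 3.16)
project(print_amdgpu_targets LANGUAGES CXX)  # hypothetical project name

# Assumption: ROCm lives under /opt/rocm.
list(APPEND CMAKE_PREFIX_PATH /opt/rocm /opt/rocm/hip)

# Value before the HIP package configuration file has run.
message(STATUS "AMDGPU_TARGETS before find_package(hip): '${AMDGPU_TARGETS}'")

find_package(hip REQUIRED CONFIG)

# Value after hip-config.cmake has run.
message(STATUS "AMDGPU_TARGETS after find_package(hip):  '${AMDGPU_TARGETS}'")
```

Configuring once without arguments and once with -DAMDGPU_TARGETS=gfx908, then comparing the two printed lines, should show whether a user-provided value survives.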

@neoblizz
Member Author

neoblizz commented Feb 7, 2023

Unfortunately, hip-config.cmake sets a selection of AMDGPU_TARGETS by default. That is where those values are coming from. In my opinion, that behaviour is misguided. Among other things, it breaks the architecture autodetection that is built into the compiler. There was a ticket raised about this, but it didn't end up going anywhere. I'll raise the discussion again, and maybe we can fix this for ROCm 6.0.

Yeah, additionally, users would like the ability to override the default behavior, even if it is set to autodetect. I want to compile/develop on one machine and run on another, and I know which architecture I want it compiled for (not autodetected or defaulted). If I could use the CMake variables (CMAKE_HIP_ARCHITECTURES/CMAKE_CUDA_ARCHITECTURES) for all of this (which is what thrust and other libraries rely on), that would make the most sense, since I don't think AMDGPU_TARGETS is an official CMake variable.
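
For comparison, this is roughly how a project that uses CMake's own HIP language support picks its architecture from CMAKE_HIP_ARCHITECTURES. It is a sketch of the general CMake mechanism, not of how rocSPARSE builds; the project name and main.hip are hypothetical, and the minimum version follows the 3.21.3 figure quoted later in this thread.

```cmake
cmake_minimum_required(VERSION 3.21.3)

# Choose the architecture explicitly unless the user already passed
# -DCMAKE_HIP_ARCHITECTURES=... on the command line.
if(NOT DEFINED CMAKE_HIP_ARCHITECTURES)
  set(CMAKE_HIP_ARCHITECTURES gfx908)
endif()

project(hip_arch_example LANGUAGES CXX HIP)  # hypothetical project name

# main.hip is a hypothetical source file containing host and device code.
add_executable(hip_arch_example main.hip)

# The HIP_ARCHITECTURES target property is initialized from
# CMAKE_HIP_ARCHITECTURES when the target is created.
get_target_property(_archs hip_arch_example HIP_ARCHITECTURES)
message(STATUS "Compiling hip_arch_example for: ${_archs}")
```

With this setup the build machine does not need the target GPU installed, which matches the compile-here, run-elsewhere workflow described above.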

This concerns me. That looks like a rocm-cmake bug. Do you have the CMake error log for these failures?

Let me get that for you, happy to provide any logs if it means I can eventually use add_subdirectory() or FetchContent_MakeAvailable() (preferred). 😄
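
For reference, the FetchContent_MakeAvailable() form mentioned above could look roughly like the sketch below. It also sets CMAKE_POLICY_DEFAULT_CMP0077 to NEW, so that option() in the fetched project honors the normal variable BUILD_CLIENTS_SAMPLES instead of clearing it, as the CMP0077 warning in the log describes. The sketch does not address the AMDGPU_TARGETS default discussed in this thread, and add_subdirectory-style use is not confirmed as supported.

```cmake
# FetchContent_MakeAvailable() needs CMake 3.14 or newer.
include(FetchContent)
set(FETCHCONTENT_QUIET ON)

# Policies left unset by the fetched project's cmake_minimum_required()
# fall back to this default, so option() keeps normal variables.
set(CMAKE_POLICY_DEFAULT_CMP0077 NEW)
set(BUILD_CLIENTS_SAMPLES OFF)

FetchContent_Declare(
    rocsparse
    GIT_REPOSITORY https://github.com/ROCmSoftwarePlatform/rocSPARSE.git
    GIT_TAG        rocm-5.4.2
)

# Populates the content and calls add_subdirectory() on it in one step.
FetchContent_MakeAvailable(rocsparse)
```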

@neoblizz
Member Author

neoblizz commented Feb 7, 2023

I hope this helps! @cgmb

CMake Output

-- Defaulting to Release build type
-- loops HIP Platform: amd
-- loops HIP Architecture: gfx908
-- The CXX compiler identification is GNU 11.3.0
-- The C compiler identification is GNU 11.3.0
-- The HIP compiler identification is Clang 15.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting HIP compiler ABI info
-- Detecting HIP compiler ABI info - done
-- Check for working HIP compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting HIP compile features
-- Detecting HIP compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- hip::amdhip64 is SHARED_LIBRARY
-- Cloning External Project: Thrust and CUB
-- Cloning External Project: CXXOPTS
-- Cloning External Project: rocSparse
-- The Fortran compiler identification is Flang 99.99.1
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /opt/rocm/llvm/bin/flang - skipped
-- Using hip-clang to build for amdgpu backend
CMake Warning (dev) at externals/rocsparse-src/CMakeLists.txt:72 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'BUILD_CLIENTS_SAMPLES'.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Git: /usr/bin/git (found version "2.34.1")
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Failed
-- AMDGPU_TARGETS: gfx900;gfx906;gfx908;gfx90a;gfx1030
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Failed
-- hip::amdhip64 is SHARED_LIBRARY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Backward Compatible Sym Link Created for include directories
-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/repo/build

Make Output

...
[  2%] Building CXX object externals/rocsparse-src/library/CMakeFiles/rocsparse.dir/src/handle.cpp.o
c++: error: unrecognized command-line option ‘--offload-arch=gfx900’
c++: error: unrecognized command-line option ‘--offload-arch=gfx906’
c++: error: unrecognized command-line option ‘--offload-arch=gfx908’
c++: error: unrecognized command-line option ‘--offload-arch=gfx90a’
c++: error: unrecognized command-line option ‘--offload-arch=gfx1030’

Logs

CMakeError.log
CMakeOutput.log

@neoblizz
Member Author

neoblizz commented Feb 7, 2023

In my efforts (and stubbornness) to get this working: if you comment out the line that sets AMDGPU_TARGETS in /opt/rocm/hip/lib/cmake/hip/hip-config.cmake (I edited it with sudo vim), the user-defined AMDGPU_TARGETS (my override) gets picked up, which leads to a new error instead:

[  2%] Building CXX object externals/rocsparse-src/library/CMakeFiles/rocsparse.dir/src/handle.cpp.o
c++: error: language hip not recognized
c++: error: language hip not recognized
make[2]: *** [externals/rocsparse-src/library/CMakeFiles/rocsparse.dir/build.make:76: externals/rocsparse-src/library/CMakeFiles/rocsparse.dir/src/handle.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:240: externals/rocsparse-src/library/CMakeFiles/rocsparse.dir/all] Error 2

Which seems equally odd. 🖖

@neoblizz
Member Author

neoblizz commented Feb 7, 2023

AND it works! Once I set the host compiler correctly with CXX=hipcc, it compiled perfectly, and all the tests that previously failed passed. Now I am curious as to why I have to do that. Is it because rocsparse uses .cpp extensions for files even with GPU code?
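
One way to make that compiler choice reproducible, without exporting CXX in every shell, is a small toolchain file. The sketch below rests on assumptions: the file name is hypothetical, and the hipcc path assumes the /opt/rocm prefix seen elsewhere in this thread.

```cmake
# toolchain-hipcc.cmake (hypothetical file name)
# Assumption: hipcc is installed at /opt/rocm/bin/hipcc.
# Use hipcc as the C++ compiler so that .cpp files containing GPU code
# are compiled by a compiler that understands --offload-arch.
set(CMAKE_CXX_COMPILER /opt/rocm/bin/hipcc)
```

It would be passed at configure time with -DCMAKE_TOOLCHAIN_FILE=toolchain-hipcc.cmake. The compiler cannot be switched in an existing build directory, so this needs a fresh one.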

@cgmb
Contributor

cgmb commented Feb 7, 2023

rocSPARSE predates HIP language support in CMake (which was added in 3.21.3). At the time, the only option was hipcc. We could change the library to build using the CMake language support, but that would greatly increase the minimum version of CMake required to build the library. I put together an example last year of how that sort of thing could be made a configurable option (ROCm/rocRAND#266), but I haven't had time to follow up on it.

The use of add_subdirectory is not explicitly supported, so you might run into issues if you try using it with multiple ROCm libraries. In particular, global CMake properties like targets and cache variables might not be properly namespaced to prevent conflicts. Nevertheless, I'm happy to hear that it works for you. If you do run into any problems, feel free to follow up in this thread. @lawruble13 and I do want to make it into a supported build feature eventually.
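
Since add_subdirectory shares targets and cache variables with the parent project, one alternative that keeps rocSPARSE's configuration isolated is CMake's ExternalProject module, which configures and builds the library as a separate CMake invocation at build time. The sketch below is untested against rocSPARSE and rests on assumptions: the target name and install prefix are made up, hipcc is assumed to be on PATH, and AMDGPU_TARGETS and BUILD_CLIENTS_SAMPLES are the variables that appear earlier in this thread.

```cmake
include(ExternalProject)

# Hypothetical install location inside the build tree.
set(ROCSPARSE_PREFIX "${CMAKE_BINARY_DIR}/externals/rocsparse")

ExternalProject_Add(
    rocsparse_external  # hypothetical target name
    GIT_REPOSITORY https://github.com/ROCmSoftwarePlatform/rocSPARSE.git
    GIT_TAG        rocm-5.4.2
    # These arguments go to rocSPARSE's own configure step, so they cannot
    # clash with variables or targets of the parent project.
    CMAKE_ARGS
        -DCMAKE_CXX_COMPILER=hipcc
        -DCMAKE_BUILD_TYPE=Release
        -DCMAKE_INSTALL_PREFIX=${ROCSPARSE_PREFIX}
        # A single architecture avoids list-separator handling in CMAKE_ARGS.
        -DAMDGPU_TARGETS=gfx908
        -DBUILD_CLIENTS_SAMPLES=OFF
)
```

Because the library is only built at build time, the parent project cannot use rocSPARSE's targets at configure time; it would have to link against the files under the install prefix instead.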

One last question for you: when you say that the tests passed, do you mean that you are running ROCm GPU code on WSL? Or was it just that it compiled successfully?

@neoblizz
Member Author

neoblizz commented Feb 7, 2023

Ah! That makes complete sense. Although I would argue that the current minimum CMake version, 3.5, is probably too old. 3.15 supposedly supports add_subdirectory, for example. The rocRAND#266 proposal looks very useful for my workflow (thank you for sharing!).

I will keep you guys posted if things fail to work, but for now, since I was only using rocSparse, it worked flawlessly. I am probably still not going to opt for this method, as it required me to change a system file (hip-config.cmake) that is not part of my project.

One last question for you: when you say that the tests passed, do you mean that you are running ROCm GPU code on WSL? Or was it just that it compiled successfully?

I wish I could get it to run GPU code on WSL! I work for AMD, so if you have some internal method, I promise I won't tell anyone. 🤫 For now, it just compiles successfully on WSL, and the tests that passed were the CMake COMPILER_HAS_TARGET_ID_* tests that previously failed (I run my code on a separate machine over ssh):

-- Performing Test COMPILER_HAS_TARGET_ID_gfx803
-- Performing Test COMPILER_HAS_TARGET_ID_gfx803 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx900_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx906_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx908_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_off - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on
-- Performing Test COMPILER_HAS_TARGET_ID_gfx90a_xnack_on - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Failed
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102
-- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Failed


3 participants