Undefined references when using new Cuda-UVM flags #1025

mcarlson801 · 2024-01-26T16:13:15Z

It looks like we're now getting a bunch of undefined references on Weaver now that I've turned on the new Cuda UVM flags. The errors can be seen here: https://sems-cdash-son.sandia.gov/cdash/viewBuildError.php?buildid=56941

An example:

tmpxft_001e528f_00000000-6_Albany_Utils.cudafe1.cpp:(.text+0x164c): undefined reference to `Tpetra::MultiVector<double, int, long long, Tpetra::KokkosCompat::KokkosDeviceWrapperNode<Kokkos::Cuda, Kokkos::CudaSpace> >::putScalar(double const&)'

It looks like KokkosDeviceWrapper is looking for functions with CudaSpace template arguments instead of CudaUVMSpace.

I'm not sure exactly when this showed up since I forgot to update the weaver scripts when I updated them in the repo so we only started nightly testing of this today. It worked when I tested trilinos/Trilinos#12626 so this showed up sometime between now and then.

The text was updated successfully, but these errors were encountered:

mcarlson801 · 2024-01-26T20:37:24Z

Tagging @csiefer2 and @mperego for visibility

I reverted the weaver dashboard to the old flag for the time being so we don't lose coverage.

mperego · 2024-02-22T19:26:04Z

@cgcgcg FYI, we have seen these errors when enabling UVM with the new flags. Hopefully they will be caught by the new PR trilinos/Trilinos#12767.

cgcgcg · 2024-02-23T19:02:09Z

Are you sure Albany is correctly set up to use the Tpetra shared spaces flag? I see this

Albany/src/Albany_TpetraTypes.hpp

Lines 52 to 62 in 93c7e76

    
           typedef Tpetra::Map<Tpetra_LO, Tpetra_GO, KokkosNode>                 Tpetra_Map; 
        
           typedef Tpetra::Export<Tpetra_LO, Tpetra_GO, KokkosNode>              Tpetra_Export; 
        
           typedef Tpetra::Import<Tpetra_LO, Tpetra_GO, KokkosNode>              Tpetra_Import; 
        
           typedef Tpetra::CrsGraph<Tpetra_LO, Tpetra_GO, KokkosNode>            Tpetra_CrsGraph; 
        
           typedef Tpetra::CrsMatrix<ST, Tpetra_LO, Tpetra_GO, KokkosNode>       Tpetra_CrsMatrix; 
        
           typedef Tpetra::FECrsGraph<Tpetra_LO, Tpetra_GO, KokkosNode>          Tpetra_FECrsGraph; 
        
           typedef Tpetra::FECrsMatrix<ST, Tpetra_LO, Tpetra_GO, KokkosNode>     Tpetra_FECrsMatrix; 
        
           typedef Tpetra::RowMatrix<ST, Tpetra_LO, Tpetra_GO, KokkosNode>       Tpetra_RowMatrix; 
        
           typedef Tpetra::Operator<ST, Tpetra_LO, Tpetra_GO, KokkosNode>        Tpetra_Operator; 
        
           typedef Tpetra::Vector<ST, Tpetra_LO, Tpetra_GO, KokkosNode>          Tpetra_Vector; 
        
           typedef Tpetra::MultiVector<ST, Tpetra_LO, Tpetra_GO, KokkosNode>     Tpetra_MultiVector;

and this

Albany/src/Albany_KokkosTypes.hpp

Line 29 in 93c7e76

typedef Tpetra::KokkosCompat::KokkosDeviceWrapperNode<PHX::Device> KokkosNode;

but nothing in Phalanx is using the Tpetra shared space stuff.

mperego · 2024-03-27T16:28:38Z

@rppawlo Christian helped me understand a bit better this issue.
Our goal is to use Cuda UVM without using the deprecated -D Kokkos_ENABLE_CUDA:BOOL=ONoption.
Tpetra and Kokkos Kernels allow to use CUDA UVM by setting

-D KokkosKernels_INST_MEMSPACE_CUDAUVMSPACE=ON
-D Tpetra_ALLOCATE_IN_SHARED_SPACE=ON

However, Phalanx does not allow us to do so. At the moment, in Albany we use Phalanx::Device everywhere, and we end up having issues because our Tpetra vector and maps and matrices use Phalanx::Device, which doesn't use Cuda UVM, whereas Tpetra is expecting us to use Cuda UVM.
The most natural options for us would be for Phalanx to use Cuda UVM when Tpetra is, or, alternatively, to allow Albany to set a DeviceType for Phalanx (I know that you have worked on a branch that is supposed to do so, and it's probably the best solution for Trilinos).
Let us know what you think. This is not urgent, but we would like to have this addressed at some point and we can probably put some resources into it.

bartgol · 2024-03-27T21:01:51Z

I don't think we should make Phalanx look at what Tpetra uses, b/c PHX has no dependence on Tpetra.

Probably the cleanest solution would be to add a -DPhalanx_ENABLE_SHARED_SPACE=ON cmake option for phalanx, cmakedefine it in Phalanx_config.hpp.in, and then in Phalanx_KokkosDeviceTypes.hpp do

  using exec_space = PHX::Device::execution_space;
  using ExecSpace  = PHX::Device::execution_space;

#ifdef PHALANX_ENABLE_SHARED_SPACE
  using MemSpace   = Kokkos::SharedSpace;
  using mem_space  = Kokkos::SharedSpace;
#else
  using MemSpace   = PHX::Device::memory_space;
  using mem_space  = PHX::Device::memory_space;
#endif

It is up to the downstream app then to ensure that Tpetra and PHX use compatible spaces. If they so desire. There is no a-priori need to have PHX and Tpetra use the same mem space (though it is probably convenient).

rppawlo · 2024-03-27T22:01:48Z

@bartgol 's comment is probably the fastest way to accomplish uvm support. There would have to be some changes to the memory allocators as well (make sure phalanx uses MemSpace consistently and still allow users to override when needed). We do not want to introduce a dependency between phalanx and tpetra just to get a default type. There is a branch that makes the PHX::Device a true Kokkos::Device (my preferred long term solution), but that causes a mess of backwards incompatible changes. It severely impacts the downstream apps. The branch is in my personal fork of trilinos: rppawlo/phalanx-device-object-refactor if you want to take a look. I do have some refactoring tools as well that get you the 90% solution. We should have a quick meeting about this.

bartgol · 2024-03-27T22:12:20Z

I think that setting the Device would not solve the problem. The issue is the memory space that PHX uses. Without using the deprecated Kokkos_ENABLE_CUDA_UVM, the mem space of the Cuda device will always be CudaSpace. It is up to the kokkos customer (PHX in this case) to use Kokkos::SharedSpace as a memory space, rather than DeviceType::memory_space (and this can be done on a per-view basis as well!).

mperego · 2024-03-27T22:25:04Z

Sounds good. I didn't think of the fact that Phalanx doesn't depend on Tpetra

@bartgol The "true" Kokkos::Device, not the current PHX::Device, let you set both the execution space and the memory space, e.g. Kokkos::Device<Kokkos::Cuda,Kokkos::CudaUVMSpace>. But I might be misunderstanding what you are saying.

@rppawlo, sounds good, let's have a quick meeting when you have time.

bartgol · 2024-03-27T23:27:09Z

Ah, you're right. I confused Kokkos::Device with the exec space. Makes sense then.

This reverts commit 4ea3c76. These flags are not working and our weaver nightlies haven't been using them anyways (the script needs to be manually copied to project space). See #1025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Undefined references when using new Cuda-UVM flags #1025

Undefined references when using new Cuda-UVM flags #1025

mcarlson801 commented Jan 26, 2024

mcarlson801 commented Jan 26, 2024

mperego commented Feb 22, 2024

cgcgcg commented Feb 23, 2024

mperego commented Mar 27, 2024

bartgol commented Mar 27, 2024 •

edited

Loading

rppawlo commented Mar 27, 2024

bartgol commented Mar 27, 2024 •

edited

Loading

mperego commented Mar 27, 2024

bartgol commented Mar 27, 2024

Undefined references when using new Cuda-UVM flags #1025

Undefined references when using new Cuda-UVM flags #1025

Comments

mcarlson801 commented Jan 26, 2024

mcarlson801 commented Jan 26, 2024

mperego commented Feb 22, 2024

cgcgcg commented Feb 23, 2024

mperego commented Mar 27, 2024

bartgol commented Mar 27, 2024 • edited Loading

rppawlo commented Mar 27, 2024

bartgol commented Mar 27, 2024 • edited Loading

mperego commented Mar 27, 2024

bartgol commented Mar 27, 2024

bartgol commented Mar 27, 2024 •

edited

Loading

bartgol commented Mar 27, 2024 •

edited

Loading