Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Align GpuComplex to its size (AMReX-Codes#3691)
## Summary As discussed in AMReX-Codes#3677, this PR makes the alignment of `amrex::GpuComplex` stricter to allow for coalesced memory accesses of arrays of GpuComplex by nvidia GPUs such as A100. Note that this may break `reinterpret_cast` from an array allocated as `std::complex` to `amrex::GpuComplex`, but not the other way around. ## Additional background Typical allocators (malloc, amrex CArena) give memory aligned to 16 bytes and CUDA allocators aligned to 256 bytes, which is sufficient for `amrex::GpuComplex<double>`. ## Checklist The proposed changes: - [x] fix a bug or incorrect behavior in AMReX - [ ] add new capabilities to AMReX - [ ] changes answers in the test suite to more than roundoff level - [ ] are likely to significantly affect the results of downstream AMReX users - [ ] include documentation in the code and/or rst files, if appropriate
- Loading branch information