Reference: How to Access Global Memory Efficiently in CUDA Fortran Kernels (C/C++ version: How to Access Global Memory Efficiently in CUDA C/C++ Kernels)