This can easily be fixed by using a tuned multi-dimensional launch configuration (which is what we do in ClimaCore). However, fusion still doesn't occur unless we "force" linear indexing, which poses a few problems for us:
broadcasted index support seems to have changed in Julia 1.11 (see "#1920 broke CI on Julia 1.11", ClimaCore.jl#1923); we need to open an issue in JuliaLang to better understand the new required interface
computing linear index offsets can only be done efficiently if we have datalayouts where the field index is (first or) last
using linear indexing is perfectly fine for pointwise kernels, but this could get pretty complicated for stencil kernels
I'm not sure what the implications are for our stencil kernels if we switch to field-ending datalayouts.
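To make the linear-offset point above concrete, here is a minimal sketch in plain Julia. It does not use ClimaCore's datalayout types: the array shapes, the sizes `Nij`, `Nf`, `Nh`, and the helper names `field_last_offset` and `field_middle_offset` are all hypothetical and chosen only for illustration. It compares a layout with the field index last against one with the field index in the middle.

```julia
# Hypothetical sizes: Nij points per element, Nf fields, Nh elements.
Nij, Nf, Nh = 4, 3, 5
Np = Nij * Nh  # total number of spatial points

# Field-last layout, array size (Nij, Nh, Nf):
# the spatial dimensions are contiguous, so a linear point index p
# reaches field f with a single multiply-add.
field_last_offset(p, f, Np) = p + (f - 1) * Np

# Field-middle layout, array size (Nij, Nf, Nh):
# p must first be split back into (ij, h) with an integer division
# before the field stride can be applied.
function field_middle_offset(p, f, Nij, Nf)
    h, ij = divrem(p - 1, Nij)  # zero-based element and in-element index
    return (ij + 1) + (f - 1) * Nij + h * Nij * Nf
end

A_last = zeros(Nij, Nh, Nf)
A_mid = zeros(Nij, Nf, Nh)

# Check both formulas against Julia's own Cartesian-to-linear conversion.
for p in 1:Np, f in 1:Nf
    h, ij = divrem(p - 1, Nij)
    @assert field_last_offset(p, f, Np) == LinearIndices(A_last)[ij + 1, h + 1, f]
    @assert field_middle_offset(p, f, Nij, Nf) == LinearIndices(A_mid)[ij + 1, f, h + 1]
end
```

In the field-last case a pointwise kernel can carry one linear index per thread and add a constant stride per field. The field-middle case needs a `divrem` per access, and that extra integer arithmetic is what makes the offsets less efficient to compute.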
The baseline performance of the multidimensional array kernels is not good. Based on our CUDA benchmarks:
Our multi-dimensional array fusion does not improve over unfused kernels the way our vector-fused kernels do. Could this be related to JuliaGPU/Metal.jl#101?