Open
Description
We're currently using `Int32` indices in some kernels, via the `i32` hack, because that often results in significantly better performance. However, GPUs are getting larger, and users are starting to use arrays with more than `typemax(Int32)` elements. This can result in bugs like #1963.
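To make the failure mode concrete, here is a minimal host-side sketch of what happens to an index once it passes `typemax(Int32)`. It only uses Base Julia and does not reproduce the kernel code from #1963; the variable names are illustrative.

```julia
# One element more than an Int32 index can address.
n = Int64(typemax(Int32)) + 1

# A checked conversion refuses the value ...
checked_failed = try
    Int32(n)
    false
catch err
    err isa InexactError
end

# ... while a truncating conversion silently wraps to a negative index.
wrapped = n % Int32

# 32-bit arithmetic wraps the same way when a counter passes typemax(Int32).
stepped = typemax(Int32) + Int32(1)
```

In a kernel, the wrapped value would typically surface as an out-of-bounds access or as silently skipped elements rather than as an exception.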
We should be more careful about using 32-bit indexing, and probably not use `i32` until we have a better way of deciding which index type to use. Maybe we can add some kind of `index_type` trait, defaulting to `Int` but possibly using `Int32` when the input arrays allow it, e.g., using #1895.
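A rough sketch of what such a trait could look like, purely as a starting point for discussion. The names `index_type`, `fits_in_int32`, and `choose_index_type` are hypothetical and not part of any existing API, and the length check stands in for whatever #1895 ends up providing.

```julia
# Hypothetical trait: every array type defaults to the safe choice, Int.
index_type(::Type{<:AbstractArray}) = Int
index_type(A::AbstractArray) = index_type(typeof(A))

# Assumption: an array is safe for 32-bit indexing when its largest
# linear index fits in an Int32.
fits_in_int32(A::AbstractArray) = length(A) <= typemax(Int32)

# Pick Int32 only when every input array allows it, otherwise fall back
# to the trait's default.
function choose_index_type(arrays::AbstractArray...)
    all(fits_in_int32, arrays) && return Int32
    return mapreduce(index_type, promote_type, arrays; init=Int)
end

small = zeros(Float32, 16)
huge = 1:(Int64(typemax(Int32)) + 1)  # a range, so nothing is allocated

T_small = choose_index_type(small)        # Int32
T_mixed = choose_index_type(small, huge)  # Int
```

Since the choice depends on a runtime length, the result would have to be passed to the kernel launch through a function barrier (or as a type parameter) so that each kernel is still compiled for a single concrete index type.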