Hello,
I'm able to use this package with CUDA Arrays, since the allowed_getindex
(for example used here) does not handle the indexing error of the CuArrays. However, this slows down the computation, since this process would be transferred and then executed on the CPU.
Is it possible, for example in that for loop I linked, to avoid using the allowed_getindex
function? It would be a very good improvement, not only for GPU calculations.