Replies: 1 comment
-
Unfortunately, BCOO is very slow: XLA is a non-sparse-aware compiler, and BCOO is a reference implementation of sparse operations within that dense compiler context; it was never designed to be particularly performant. If it's useful for you, you should use it. If it's not useful, you shouldn't.
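To illustrate what working within the dense compiler context could look like for per-row categorical sampling, here is a minimal sketch. It assumes the densified array fits in memory, which may not hold for large problems. The helper `sample_rows`, the `counts` and `max_count` arguments, and the toy data are hypothetical and not taken from the original post. Missing entries are filled with `-inf` rather than 0, since the stored values are log probabilities.

```python
from functools import partial

import jax
import jax.numpy as jnp
from jax.experimental import sparse

# Toy BCOO matrix of log probabilities: 3 rows, 5 columns (illustrative data only).
indices = jnp.array([[0, 1], [0, 3], [1, 0], [2, 2], [2, 4]])
data = jnp.log(jnp.array([0.5, 0.5, 1.0, 0.25, 0.75]))
log_probs = sparse.BCOO((data, indices), shape=(3, 5))

# Densify by hand so that absent entries become -inf (probability zero).
dense = jnp.full(log_probs.shape, -jnp.inf)
dense = dense.at[log_probs.indices[:, 0], log_probs.indices[:, 1]].set(log_probs.data)


@partial(jax.jit, static_argnames=("max_count",))
def sample_rows(key, logits, counts, max_count):
    """Draw max_count samples per row, then mask slots beyond counts[i] with -1."""
    n_rows = logits.shape[0]
    # The requested shape must broadcast against the batch shape (n_rows,).
    draws = jax.random.categorical(key, logits, axis=-1, shape=(max_count, n_rows))
    draws = draws.T  # -> (n_rows, max_count)
    keep = jnp.arange(max_count)[None, :] < counts[:, None]
    return jnp.where(keep, draws, -1)


counts = jnp.array([4, 1, 2])  # number of samples wanted from each row
samples = sample_rows(jax.random.PRNGKey(0), dense, counts, max_count=4)
```

Every row is oversampled to `max_count` and the surplus is masked out, which keeps all shapes static so the whole function can be jitted. The cost is wasted work when the per-row counts differ a lot.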
-
I have a use case where the log probabilities of many categorical distributions are stored in a BCOO array. I want to draw a different number of samples from each row.
The method that does the required operation is here:
A minimal code block that runs this is
Unfortunately, this is very slow. The Perfetto trace, for instance, shows that the entire genexpr step takes 0.5 s and that sum_duplicates alone also takes 0.5 s. Does anyone have an idea how to speed up such an operation? I already tried vmap and jit, but they can wrap neither this function nor its slow parts.