fix: GPU complex reducer prod for empty lists #3235
Conversation
I don't know why this makes a difference. As I understand it, T is float when the kernel is applied to complex64 and T is double when the kernel is applied to complex128. C-style casting should do a conversion-cast to the appropriate type, regardless of whether the constant was initially int or float (and it should do it at compile-time).
root [0] (float)1.0f
(float) 1.00000f
root [1] (double)1.0f
(double) 1.0000000
root [2] (float)1
(float) 1.00000f
root [3] (double)1
(double) 1.0000000
But if it does make a difference, what test demonstrates that? Just adding a test that fails for the old code and passes for the new code would be convincing enough.
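To restate that point outside the ROOT prompt, here is a minimal, self-contained C++ sketch (illustrative only; prod_identity is a hypothetical helper, not the actual CUDA kernel code). It shows that an integer literal cast to the template parameter T yields the right floating-point identity for both the complex64 (T = float) and complex128 (T = double) instantiations, all at compile time:

```cpp
#include <complex>
#include <iostream>
#include <type_traits>

// Illustrative only: the multiplicative identity for a reducer whose element
// type is T, mirroring the (T)1-style cast discussed above.
template <typename T>
std::complex<T> prod_identity() {
  // The C-style cast converts the int literals to T (float or double) at
  // compile time, so complex64 and complex128 both get exactly 1.0 + 0.0i.
  return std::complex<T>((T)1, (T)0);
}

int main() {
  auto id32 = prod_identity<float>();   // complex64 case: T = float
  auto id64 = prod_identity<double>();  // complex128 case: T = double

  static_assert(std::is_same<decltype(id32), std::complex<float>>::value, "");
  static_assert(std::is_same<decltype(id64), std::complex<double>>::value, "");

  std::cout << id32 << " " << id64 << "\n";  // prints (1,0) (1,0)
}
```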
Please check the test that was reported in #3214 - it does not fail for me anymore. However, I was thinking along the same lines: why does it make a difference? I did some more digging. It turns out that we have the same or similar issue with
(In our meeting, @ianna pointed out that one of the tests was failing without this fix. So we do have a test that is sensitive to it.)
All tests-cuda pass on my GPU as well, so now we've double-tested it.
I'll merge this because it was passing all CI tests before the set of required tests was partially updated. It's only partially updated because the new tests in #3217 haven't been run in a while, so they aren't in the drop-down menu to add until I run them again. That was taking a while, so I came over here to take care of this PR without waiting for the other one to finish first.
fixes issue #3214
The test failure is due to a mismatch between the expected and actual results in the assertion cpt.assert_allclose. The issue arises when comparing the results of the ak.prod operation on arrays converted to the CUDA backend (cuda_depth1) and the original CPU backend (depth1). Specifically, there's a discrepancy in the third element of the arrays being compared (which is an EmptyArray): 0j versus (1.4641 - 0j). The product of no elements is defined as 1.
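As a reminder of that convention, here is a small standalone C++ illustration (not code from this repository; std::accumulate merely stands in for the reducer): the product over an empty sequence is the multiplicative identity 1 + 0i, never 0.

```cpp
#include <complex>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  using c128 = std::complex<double>;

  // An empty list of complex values, analogous to the EmptyArray in the test.
  std::vector<c128> empty;

  // Reducing no elements with multiplication returns the identity, 1 + 0i.
  c128 prod = std::accumulate(
      empty.begin(), empty.end(), c128(1.0, 0.0),
      [](const c128& acc, const c128& x) { return acc * x; });

  std::cout << prod << "\n";  // prints (1,0), not (0,0)
}
```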