Test on blake failed for a long time #832
Comments
Are those the performance tests (I can't see CDash while overseas)? I believe @jewatkins was monitoring these for some time but I'm not sure what happened. I agree about changing / deactivating the tests if they are going to fail.
These are the same failing tests discussed in #712 and I think the same issue remains. We'd probably have to increase the tolerance to 1e-2 because the GPU tests might still give the same result.
More generally, I suspect that at some point we should use machine-specific test values. In general, we can't expect the solution to be the same across architectures. Yes, we are using some tolerance, but unless the nonlinear tolerances are ridiculously low, a tiny residual might still mean not-so-tiny solution differences (depending on the problem's conditioning). OTOH, a machine-specific baseline/test value is supposed to always give us the same value (unless ranks are changed, the Trilinos implementation changes, or some part of the code uses randomized stuff).
In this particular case, it's odd to me that the result was the same on both blake and weaver and then suddenly differed on blake only. But I'd be okay with machine-specific tests since that is what E3SM does. At that point, we could tighten the tolerances. We should decide what we would like to do for E3SM integration and follow suit.
Right, with machine-specific expected values, we can be stricter and more robust against answer-changing mods.
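To make the machine-specific idea concrete, here is a minimal sketch of a per-machine baseline lookup with a tight tolerance. Everything in it is hypothetical: the `BASELINES` table, the numeric values, the `expected_response` and `check` helpers, and the `1e-8` tolerance are illustrative only and are not taken from the actual test suite.

```python
# Hypothetical per-machine baselines; the numbers are placeholders, not real test values.
BASELINES = {
    "blake": {"response": 1.00148},
    "weaver": {"response": 1.0},
}


def expected_response(machine, baselines=BASELINES):
    """Return the recorded baseline for a machine, failing loudly if none exists."""
    try:
        return baselines[machine]["response"]
    except KeyError:
        raise KeyError(f"no baseline recorded for machine '{machine}'") from None


def check(machine, computed, rel_tol=1e-8):
    """Compare a computed response against that machine's own baseline."""
    expected = expected_response(machine)
    return abs(computed - expected) <= rel_tol * abs(expected)


if __name__ == "__main__":
    # The same computed value passes on one machine and fails on the other,
    # because each machine is compared against its own baseline.
    print(check("blake", 1.00148))
    print(check("weaver", 1.00148))
```

With a scheme like this, a missing entry raises an error instead of silently falling back to another machine's value, so a new machine has to get its own baseline before its tests can pass.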
Following up on Irina's email, I noticed that some blake tests have failed for ages. This one, for instance, started failing on June 8th, and has failed ever since.
I looked at our PR history, but no PR was merged around that day. However, I think we can push to master, so someone might have pushed something straight to master. Also, I don't recall if there was a system upgrade/change around then.
The fail is in the response check:
and it's a relative change of 1.48e-3.
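For reference, a minimal sketch of how a relative change of this size interacts with a relative tolerance. The `expected` and `computed` values are made up purely to reproduce a 1.48e-3 relative change, the helper names are hypothetical, and the tighter `1e-4` tolerance is an assumption for illustration (the tolerance actually used by the test is not stated in this thread).

```python
def relative_change(computed, expected):
    """Relative difference of computed with respect to a nonzero expected value."""
    return abs(computed - expected) / abs(expected)


def response_passes(computed, expected, rel_tol):
    """True if the relative change is within the given relative tolerance."""
    return relative_change(computed, expected) <= rel_tol


# Made-up values chosen only so that the relative change is 1.48e-3.
expected = 1.0
computed = expected * (1.0 + 1.48e-3)

if __name__ == "__main__":
    print(relative_change(computed, expected))
    # Passes with the loosened tolerance of 1e-2 ...
    print(response_passes(computed, expected, rel_tol=1e-2))
    # ... but fails with a tighter, assumed tolerance of 1e-4.
    print(response_passes(computed, expected, rel_tol=1e-4))
```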
@mperego what are your thoughts?