corePDEs_SideSetLaplacian_3D failing on weaver after epetra removal #1030
Comments
@jewatkins: That's correct, this test was running with Epetra and was never run with Tpetra. I briefly looked into it but could not figure out the issue. If @mcarlson801 is willing to look into it, that would be great.
This test is failing now in the OpenMP build as well: https://sems-cdash-son.sandia.gov/cdash//test/5973004 . It looks like the comparisons are failing because the computed response value is 0. The CUDA build is failing in the same way: https://sems-cdash-son.sandia.gov/cdash//test/5969262
I did a little bit of debugging on this, and apparently the field (…
That's odd. It should only happen at the first iteration, where the initial guess is 0. The NaN in the solver is likely preventing the solution from ever changing, so it stays 0. I'm guessing the problem is that some entry of the Jac that should be 1 is actually kept at 0. Maybe something is amiss with the diagonal terms. I can dig in quickly.
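A minimal sketch of how one could check that hypothesis (not Albany code; `checkDiagonal` and `jac` are hypothetical names, and default Tpetra template parameters are assumed): pull the locally owned diagonal of the assembled Jacobian and print any row where it is still 0.

```cpp
#include <Tpetra_CrsMatrix.hpp>
#include <Tpetra_Vector.hpp>
#include <Teuchos_RCP.hpp>
#include <iostream>

// Hypothetical debug helper (default Tpetra template parameters assumed):
// report any locally owned row whose diagonal entry is exactly 0, e.g. a
// side-set Dirichlet row that should have been set to 1.
using Matrix = Tpetra::CrsMatrix<>;
using Vector = Tpetra::Vector<>;
using LO     = Matrix::local_ordinal_type;

void checkDiagonal(const Teuchos::RCP<const Matrix>& jac)
{
  Vector diag(jac->getRowMap());
  jac->getLocalDiagCopy(diag);      // copy of the locally owned diagonal entries

  auto diagData = diag.getData();   // host-accessible view of the values
  for (LO lrow = 0; lrow < static_cast<LO>(diagData.size()); ++lrow) {
    if (diagData[lrow] == 0.0) {
      std::cout << "Zero diagonal at global row "
                << jac->getRowMap()->getGlobalElement(lrow) << "\n";
    }
  }
}
```

If the zero rows lined up with the side-set DOFs, that would confirm the diagonal never gets its 1.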
It looks like this test passed in the OpenMP build last night. I just ran master on my workstation, and it works just fine. I don't see any relevant commit that went in yesterday, so I don't know what to make of this. As of now, CUDA is the only build that still shows the error. The fact that CUDA consistently fails may suggest an issue with the row/col GIDs for the side equation. Perhaps the same diagonal entry is set twice (once to 0 and the other time to 1), and depending on the order in which the two writes happen it can end up with the right or wrong value. I don't have time for an in-depth debug today, and I leave for vacation on Friday, so feel free to disable the test until I get back if you feel like it. When I get back, I can debug some more.
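For what it's worth, here is a toy Kokkos sketch (made-up names, not Albany code) of that kind of order-dependent double write, showing why the same assembly could give the right value in serial and the wrong one on OpenMP/CUDA depending on which write lands last:

```cpp
#include <Kokkos_Core.hpp>
#include <iostream>

// Toy illustration of the suspected race: two "assembly" passes write
// different values (0 and 1) into the same diagonal slot. Without a fixed
// ordering, whichever write lands last wins, so the result can differ
// between serial, OpenMP, and CUDA runs.
int main(int argc, char* argv[])
{
  Kokkos::initialize(argc, argv);
  {
    Kokkos::View<double*> diag("diag", 1);

    Kokkos::parallel_for("conflicting_writes", 2, KOKKOS_LAMBDA(const int i) {
      diag(0) = static_cast<double>(i);   // i==0 writes 0, i==1 writes 1
    });

    auto h_diag = Kokkos::create_mirror_view(diag);
    Kokkos::deep_copy(h_diag, diag);
    std::cout << "diag(0) = " << h_diag(0) << "\n";  // may print 0 or 1
  }
  Kokkos::finalize();
  return 0;
}
```

In serial the iteration order is fixed, so the outcome is deterministic, which would be consistent with the test passing on a workstation and only flaking in the threaded/CUDA builds.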
@bartgol: thanks for looking into this. If you check the history of the test in the camobap OpenMP build, it seems it fails for a while, then passes for a while: https://sems-cdash-son.sandia.gov/cdash//test/5973004?graph=status . This suggests a heisenbug, which is disturbing. I would suspect the OpenMP issue is the same as the CUDA one, so if the CUDA one is fixed, hopefully things will be good for OpenMP as well. I'm fine with either keeping or disabling the test. I will make it a point not to open any more duplicate issues about this test :).
https://sems-cdash-son.sandia.gov/cdash/test/4156291
It looks like weaver has more tests after #1028 was merged (111 -> 125), so maybe this case never ran before?
@mperego Is this a case that ran with Epetra and never ran with Tpetra? Were you planning to look into it? Is there anything obvious that shouldn't work on device? If not, @mcarlson801 can try to see what's going on.