Encore tests failing due to float precision #1168
Comments
A fancier way to solve the issue would be to have an intermediate layer on top of |
Yes, I missed this aspect when changing MemoryReader to always use float32. I guess it should be safe to convert the pureRMSD calculation to use float32 as well (at least we can try, and see how many of the encore tests fail). Ultimately, I agree that it would be nice to have support for both float64 and float32, but since float64 does not seem to be generally supported in MDAnalysis currently, it's probably not worth putting in the effort to put in an intermediate layer at this point. |
I would also like to have it float32. Otherwise the change of dtype will cause a copy of the array in memory and the computation might not succeed because we run out of memory. |
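The memory concern above can be illustrated with a small NumPy sketch. The array sizes are made up for illustration and are not taken from the ENCORE tests. It only shows that converting a float32 array to float64 allocates a second buffer twice as large, while requesting the dtype the array already has does not copy.

```python
import numpy as np

# Hypothetical trajectory: 1000 frames x 500 atoms x 3 coordinates (assumed sizes).
coords32 = np.zeros((1000, 500, 3), dtype=np.float32)

# Requesting the dtype the array already has reuses the existing buffer.
same = np.asarray(coords32, dtype=np.float32)

# Requesting float64 forces a new allocation with twice the bytes.
coords64 = np.asarray(coords32, dtype=np.float64)

print(np.shares_memory(coords32, same))      # buffer is reused
print(np.shares_memory(coords32, coords64))  # separate buffer
print(coords32.nbytes, coords64.nbytes)
```

For a large in-memory trajectory, that extra float64 buffer is what could push the computation over the available memory.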
http://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html @jbarnoud if we want proper float32/float64 support this might be a way to do it |
This happens to be the exact page I am reading now ;)
On 17-01-17 14:53, Richard Gowers wrote:
http://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html @jbarnoud if we want proper float32/float64 support this might be a way to do it
|
Thanks for the comments - I agree it would be better to make it float32 so as to be consistent with the rest, and then revise it later if we need to introduce support for float64 as well. I'll get on it and let's see how it works. |
I investigated the problem a bit. The first branch is calling The second branch is visited when there are fit coordinates provided. There the coordinates can be float32 or float64 but they are explicitly cast to float64 before they are fed to Having everything homogenized to float32 (with explicit cast to float32 in both branches, and
Everything homogenized to float64 has every test passing. I did not try the fused types, and I won't since @mtiberti stepped in 👍 |
Thanks @jbarnoud - I tried doing the same and got the same behaviour, and it also happens using the |
Small sidenote on the float32/float64 question. When @rbrtdlgd optimized the |
@orbeckst Isn't the solution to just make sure |
If this is proving tricky to fix, can we tag the failing tests as known failures so every PR doesn't have failing tests? |
I had a look yesterday but haven't come up with a solution yet. The differences in RMSD between calculating them with float32 and float64 are small, as all of you suggested, but that's apparently enough to change the clustering outcome in that specific case (all the other tests involving clustering pass). So for the moment I would suggest changing the expected output to the current one so that at least other builds pass, and we can have a deeper look in the meantime if you think it's worth it. |
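To give a feel for the size of the difference being discussed, here is a minimal sketch that computes a plain RMSD (no superposition) on the same made-up coordinates in both precisions. The `rmsd` helper and the random data are assumptions for illustration; this is not the ENCORE PureRMSD code and it does not reproduce the clustering change seen in the test.

```python
import numpy as np

def rmsd(a, b):
    """Plain RMSD without superposition, for two (n_atoms, 3) arrays."""
    diff = a - b
    return np.sqrt(np.mean(np.sum(diff * diff, axis=1)))

rng = np.random.default_rng(0)
# Two made-up conformations of 500 atoms (illustrative data only).
x64 = rng.normal(scale=10.0, size=(500, 3))
y64 = x64 + rng.normal(scale=1.0, size=(500, 3))

r64 = float(rmsd(x64, y64))
# Same data, but rounded to single precision before the calculation.
r32 = float(rmsd(x64.astype(np.float32), y64.astype(np.float32)))

print(r64, r32, abs(r64 - r32))
```

A discrepancy of this kind is tiny relative to the RMSD itself, yet an algorithm that makes discrete choices from the distance matrix can still end up with different cluster assignments.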
What clustering algorithm do you use? I would be interested in the difference you see if you are able to share the dataset. Also just sharing the code in a notebook would be nice. This is just for my own curiosity for numerical stability. |
Since casting the input for |
Sure, that's also a viable option. @kain88-de I'll get back to you - for now I've been switching manually between float32 and float64. We are using Affinity Propagation. |
Thanks @mtiberti. Also, I'm for @jbarnoud's solution of casting to float64 for now to fix the bug until we understand the numerical implications of using single precision. |
I'm mostly just going to steal @jbarnoud 's debugging work here, so we can talk about the failures in their own issue...
The value error seems unrelated to the numpy version, indeed. analysis.encore.cutils.PureRMSD expects float64_t coordinates but receives float ones. The first commit to fail the test is e356604 (see travis: https://travis-ci.org/MDAnalysis/mdanalysis/builds/192119548). The pull request (#1132) was passing the tests, but it got merged right after #1145 in 789a96c. The two pull requests are touching parts of ENCORE. My guess is that the MemoryReader is now working with float32 rather than float64 which breaks analysis.encore.cutils.PureRMSD. The simple solution would be to change analysis.encore.cutils.PureRMSD to use float32 instead of float64 but I would like to know what @kain88-de and @wouterboomsma think about it before changing this.
And my 2c is that regardless of what MemoryReader is storing/supplying, there should probably be a check and/or np.asarray call somewhere to avoid this.
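A check of this kind could look roughly like the sketch below. The helper name `as_float64_coords` and the (n_atoms, 3) shape requirement are hypothetical and not part of MDAnalysis; the only real API used is `np.asarray` with its `dtype` and `order` arguments.

```python
import numpy as np

def as_float64_coords(coords):
    """Hypothetical guard: return C-contiguous float64 coordinates of shape (n_atoms, 3)."""
    # np.asarray copies only when the dtype or memory layout has to change.
    arr = np.asarray(coords, dtype=np.float64, order="C")
    if arr.ndim != 2 or arr.shape[1] != 3:
        raise ValueError("expected an (n_atoms, 3) array, got shape %r" % (arr.shape,))
    return arr

# What a float32 reader might hand over (made-up data).
frame32 = np.ones((10, 3), dtype=np.float32)
frame64 = as_float64_coords(frame32)
print(frame64.dtype)
```

Placing such a call at the boundary of the compiled routine would make it independent of whatever dtype the reader supplies, at the cost of a copy when a conversion is needed.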