Refactor align #1197

kain88-de · 2017-02-04T08:37:30Z

See the old PR, #1136.

I'd appreciate some help in testing the new weights options I've introduced here.

Old text

It still needs some checking and tests. I would like to get this into 0.16.0, though. The RMSD class has been refactored this cycle and I'm currently not sure whether the bug was also present in 0.15.0.

Changes made in this Pull Request:

use arbitrary weights in align.py and rms.py
rewrite RMSF to use AnalysisBase
fix centering bug in RMSD (should be done with chosen weights)
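
To illustrate why the centering has to use the chosen weights, here is a minimal NumPy sketch. It is not the code in rms.py: `weighted_rmsd` and its `center_weights` parameter are made up for illustration, and no rotational superposition is done. Centering with the same weights that enter the RMSD gives the smallest value over all translations; centering with the masses while averaging with other weights does not.

```python
import numpy as np


def weighted_rmsd(a, b, weights, center_weights):
    """Centered (not rotated) RMSD; hypothetical helper for illustration."""
    # Subtraction creates new arrays, so the inputs are left untouched.
    a = a - np.average(a, axis=0, weights=center_weights)
    b = b - np.average(b, axis=0, weights=center_weights)
    squared = np.sum((a - b) ** 2, axis=1)
    return np.sqrt(np.average(squared, weights=weights))


a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
masses = np.array([1.0, 1.0, 10.0])
custom = np.array([1.0, 1.0, 1.0])  # user-chosen weights

# Old behaviour as described above: always center on the center of mass.
mismatched = weighted_rmsd(a, b, weights=custom, center_weights=masses)
# New behaviour: center with the same weights used for the RMSD.
consistent = weighted_rmsd(a, b, weights=custom, center_weights=custom)
```

With these toy coordinates `consistent` is smaller than `mismatched`, which is the kind of change in results the bug fix can produce.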

PR Checklist

Tests?
tests for new deprecated option
Docs?
CHANGELOG updated?
~~[ ] Issue raised/referenced?~~
add deprecation warnings for mass_weighted
updated functions depending on new kwarg
fix all mass_weighted in encore. Or at least raise an issue and fix in other PR
RMSF deprecation warnings for run

jbarnoud · 2017-02-10T13:00:46Z

The change in analysis.rms.RMSF is going to break old code where arguments were passed to the run method. The class needs a deprecation warning when it happens. One way of doing this would be to have a proxy run method that issues the deprecation warning if it receives more than self as an argument.
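
A minimal sketch of the proxy idea, assuming a toy class in place of RMSF. `ToyAnalysis` and `_compute` are hypothetical names, and the "frames" are just the integers 0 to 9.

```python
import warnings


class ToyAnalysis:
    """Hypothetical stand-in for an AnalysisBase-style class."""

    def __init__(self, start=None, stop=None, step=None):
        self.start, self.stop, self.step = start, stop, step
        self.results = None

    def _compute(self):
        # Placeholder for the real per-frame work: slice ten fake frames.
        self.results = list(range(10))[self.start:self.stop:self.step]
        return self

    def run(self, start=None, stop=None, step=None):
        # Proxy: warn if anything beyond `self` was passed, then honour the
        # old-style arguments so existing scripts keep working.
        passed = {"start": start, "stop": stop, "step": step}
        if any(value is not None for value in passed.values()):
            warnings.warn(
                "run arguments are deprecated. Please pass them at "
                "class construction.",
                category=DeprecationWarning,
                stacklevel=2,
            )
            for name, value in passed.items():
                if value is not None:
                    setattr(self, name, value)
        return self._compute()
```

Old code such as `ToyAnalysis().run(step=5)` still works but warns; `ToyAnalysis(step=5).run()` gives the same result silently.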

orbeckst

Question about making copies (old behaviour) vs changing user data in place (new) + docs.

orbeckst · 2017-02-11T17:50:06Z

package/MDAnalysis/analysis/rms.py

    # superposition only works if structures are centered
    if center or superposition:
        # make copies (do not change the user data!)
        # weights=None is equivalent to all weights 1
-        a = a - np.average(a, axis=0, weights=weights)
-        b = b - np.average(b, axis=0, weights=weights)
+        a -= np.average(a, axis=0, weights=weights)


This is now changing user data – is that intentional (I don't think so)? If so, the comments above must be changed.

You suggested to include this change ;). But I'll change it back; I actually prefer that it does not change user data.
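
For reference, the difference between the two spellings in plain NumPy. This is a standalone sketch; the function names are made up and are not part of rms.py.

```python
import numpy as np


def centered_copy(coords, weights=None):
    # Binary subtraction allocates a new array; the caller's data is untouched.
    return coords - np.average(coords, axis=0, weights=weights)


def center_in_place(coords, weights=None):
    # Augmented assignment writes into the array the caller passed in.
    coords -= np.average(coords, axis=0, weights=weights)
    return coords


user = np.array([[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])

copy_result = centered_copy(user)
user_after_copy = user.copy()      # snapshot: still the original coordinates

inplace_result = center_in_place(user)
# `user` itself is now centered, and `inplace_result` is the same object.
```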

orbeckst · 2017-02-11T17:52:56Z

package/MDAnalysis/analysis/encore/similarity.py

-        Whether to perform mass-weighted covariance matrix estimation
-        (default is True).
-
+    weights : str/array_like, optional


We really need to get this into 0.16.0 – otherwise we have to do deprecations for ENCORE stuff right away.

Yes, I will have some time tomorrow.

I mostly still want to add tests for custom weights. My current idea is to use the masses as custom weights and compare the outputs with those I get when I choose weights='mass'; that should be sufficient.

orbeckst · 2017-02-11T17:54:19Z

package/MDAnalysis/analysis/rms.py

            write RSMD into file file :meth:`RMSD.save`
-        mass_weighted : bool (optional)
+        mass_weighted : bool (deprecated)
             do a mass-weighted RMSD fit


do a mass-weighted RMSD fit; deprecated keyword argument, use weights='mass' instead

There should also be some markup at the bottom such as

    .. versionchanged:: 0.16.0
       Flexible weighting scheme with new ``weights`` keyword.
    .. deprecated:: 0.16.0
       Instead of ``mass_weighted=True`` use new ``weights='mass'``

They should now be added to all affected classes/functions

orbeckst · 2017-02-11T17:57:46Z

package/MDAnalysis/analysis/rms.py

-            verbose=None, quiet=None):
-        """Calculate RMSF of given atoms across a trajectory.
+class RMSF(AnalysisBase):
+    """Calculate RMSF of given atoms across a trajectory.


Please leave the original versionadded markup and add a new versionchanged ("Follows analysis API").

orbeckst · 2017-02-11T18:00:02Z

testsuite/MDAnalysisTests/analysis/test_align.py

@@ -104,9 +108,9 @@ def tearDown(self):
    def test_rmsd(self):
        self.universe.trajectory[0]  # ensure first frame
        bb = self.universe.select_atoms('backbone')
-        first_frame = bb.positions
+        first_frame = bb.positions.copy()


Is this needed because you're not making copies of user arrays anymore? (See comment above)

orbeckst · 2017-02-11T20:07:13Z

Oops. Sorry about that.

…

On Feb 11, 2017, at 12:14, Max Linke ***@***.***> wrote: You suggested to include this change ;). But I'll change it back I actually prefer for it to not change user data.

orbeckst · 2017-02-11T20:09:53Z

Another test would be to mask the full weights so that only CA weights remain one and all others are 0. Then compare to a non-weighted calculation where only the CA were selected.

…

On Feb 11, 2017, at 12:22, Max Linke ***@***.***> wrote: I mostly still want to add tests for custom weights. My current idea is to use the masses as custom weights and compare the outputs when I chose weights='mass' that should be sufficient.

kain88-de · 2017-02-12T16:37:11Z

Another test would be to mask the full weights so that only CA weights remain one and all others are 0

Good one. I'll add this now and hope that our selection code doesn't have any bugs.
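
A rough sketch of what such a masked-weights test checks, using random coordinates instead of a real universe. The `rmsd` helper below is a simplified stand-in (centering only, no rotation), and `is_ca` is a hypothetical boolean mask playing the role of a CA selection.

```python
import numpy as np


def rmsd(a, b, weights=None):
    """Centered (not rotated) RMSD; weights=None means all weights are 1."""
    a = a - np.average(a, axis=0, weights=weights)
    b = b - np.average(b, axis=0, weights=weights)
    return np.sqrt(np.average(np.sum((a - b) ** 2, axis=1), weights=weights))


rng = np.random.default_rng(0)
a = rng.normal(size=(6, 3))
b = rng.normal(size=(6, 3))
is_ca = np.array([True, False, False, True, False, True])  # hypothetical mask

# Weight 1 for the "CA" atoms and 0 for everything else ...
masked = rmsd(a, b, weights=is_ca.astype(float))
# ... should agree with an unweighted calculation on the selected atoms only.
subset = rmsd(a[is_ca], b[is_ca])
```

The two numbers agree because atoms with zero weight drop out of both the centering and the average.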

kain88-de · 2017-02-12T16:39:34Z

package/MDAnalysis/analysis/align.py

+        if mass_weighted:
+            weights = 'mass'
+
+    if weights == 'mass':


Btw, this check currently triggers a nasty warning when weights is a numpy array, because the equality operator attempts an elementwise comparison, and that will be the only comparison that happens in the future. Does anyone have an idea how to nicely express the logic that tests for an array_like?
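
One possible way to express this, as a sketch only: test for a string before ever using `==`, so the comparison is never applied to an array. `resolve_weights` is a hypothetical helper, not the function that ended up in the PR.

```python
import numpy as np


def resolve_weights(weights, masses):
    """Hypothetical helper: turn the `weights` kwarg into an array or None."""
    # Check for a string first so that `==` is never applied to an array.
    if isinstance(weights, str):
        if weights == 'mass':
            return np.asarray(masses, dtype=float)
        raise ValueError("unknown weights option: {!r}".format(weights))
    if weights is None:
        return None  # equivalent to all weights being 1
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 1 or len(weights) != len(masses):
        raise ValueError("weights must be a 1D array with one entry per atom")
    return weights


masses = np.array([12.0, 1.0, 16.0])
from_string = resolve_weights('mass', masses)
from_array = resolve_weights(np.array([1.0, 0.0, 0.0]), masses)
```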

kain88-de · 2017-02-12T16:42:22Z

package/MDAnalysis/analysis/encore/confdistmatrix.py

+    selection : str, optional
+        use this selection
+    superimposition_selection : str, optional
+        TODO


@mtiberti @wouterboomsma do you have a good idea for the doc description of this parameter? There isn't one in develop right now.

added one in aad74e4 - sorry I missed it before

kain88-de · 2017-02-12T16:43:57Z

package/MDAnalysis/analysis/encore/similarity.py

+        if len(weights) != len(ensembles):
+            raise ValueError("need weights for every ensemble")
+    else:
+        weights = [None for _ in range(len(ensembles))]


@mtiberti, @wouterboomsma is it correct for hes to require that custom weights be specified for each ensemble?

Yes, I think so. Since the ensembles are independent MDAnalysis.Universe objects, they are not technically required to have the same topology, so it makes sense to specify a weight/mass array for each of them. Thanks for these changes!
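
A small sketch of the per-ensemble check discussed here, with each ensemble represented only by its atom count. `weights_per_ensemble` is a hypothetical name, and broadcasting a string option to every ensemble is an assumption, since that branch of the diff is not shown above.

```python
def weights_per_ensemble(n_atoms_per_ensemble, weights=None):
    """Hypothetical helper: return one weights entry per ensemble."""
    n_ensembles = len(n_atoms_per_ensemble)
    if weights is None:
        return [None] * n_ensembles
    if isinstance(weights, str):
        # Assumption: an option such as 'mass' applies to every ensemble.
        return [weights] * n_ensembles
    if len(weights) != n_ensembles:
        raise ValueError("need weights for every ensemble")
    # Ensembles may differ in topology, so check each array separately.
    for w, n_atoms in zip(weights, n_atoms_per_ensemble):
        if len(w) != n_atoms:
            raise ValueError("weights do not match the number of atoms")
    return list(weights)


per_ensemble = weights_per_ensemble([2, 3], weights=[[1.0, 1.0], [1.0, 2.0, 3.0]])
```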

kain88-de · 2017-02-12T16:44:50Z

package/MDAnalysis/analysis/rms.py

+            weights = self.atomgroup.masses
+        self.weights = weights
+
+    def run(self, start=None, stop=None, step=None, progout=None,


@jbarnoud This also allows the class to be used in the old style, and it prints a warning.

kain88-de · 2017-02-12T17:22:33Z

Funny the coveralls report comes in before travis is done with the full run.

jbarnoud · 2017-02-13T08:21:07Z

package/MDAnalysis/analysis/rms.py

+            warnings.warn("run arguments are deprecated. Please pass them at "
+                          "class construction. These options will be removed in 0.17.0",
+                          category=DeprecationWarning)
+            if quiet is not None:


It would be better here to use mda.lib.log._set_verbose. That would do the same test as this, plus it would test the case where both quiet and verbose are provided. But most importantly, it will help later on to find all the places to change when we remove quiet.
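
To make the point concrete, here is a sketch of what a single shared helper for the two keywords could look like. `resolve_verbose` is a hypothetical stand-in written for this comment; it is not the actual `mda.lib.log._set_verbose`, whose signature and behaviour may differ.

```python
import warnings


def resolve_verbose(verbose=None, quiet=None, default=True):
    """Hypothetical helper: merge the deprecated `quiet` into `verbose`."""
    if quiet is not None:
        warnings.warn(
            "quiet is deprecated, use verbose instead",
            category=DeprecationWarning,
            stacklevel=2,
        )
        # verbose=True with quiet=True (or False with False) is contradictory.
        if verbose is not None and verbose == quiet:
            raise ValueError("verbose and quiet contradict each other")
        return not quiet
    if verbose is None:
        return default
    return verbose
```

Keeping the logic in one place means that removing `quiet` later only requires searching for calls to this one helper.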

jbarnoud · 2017-02-13T08:22:55Z

package/MDAnalysis/analysis/rms.py

+
+    def run(self, start=None, stop=None, step=None, progout=None,
+            verbose=None, quiet=None):
+        if np.any([el is not None for el in (start, stop, step, progout, quiet)]):


Out of curiosity, why np.any rather than just any?

I just didn't know about any
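
For the record, a quick comparison of the two on the kind of argument tuple used in the diff (the values here are made up):

```python
import numpy as np

args = (None, None, 5, None, None)  # e.g. start, stop, step, progout, quiet

# Built-in any() consumes a generator lazily and returns a plain bool.
builtin_result = any(el is not None for el in args)

# np.any() needs a materialised list, converts it to an array first,
# and returns a NumPy boolean scalar.
numpy_result = np.any([el is not None for el in args])
```

Both are truthy here; the built-in simply avoids the intermediate list and array.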

orbeckst · 2017-02-13T17:32:23Z

testsuite/MDAnalysisTests/analysis/test_align.py

-        #test mass_weighted
+        # test masses as weights
+        last_atoms_weight = self.universe.atoms.masses
+        A = self.universe.trajectory[0]


This gives Timesteps for A and B, with the side effect of moving the frame cursor.

Do you actually want to do

A = self.universe.trajectory[0].positions.copy()

???

I think the timestep is always a new one. But the timestep has a similar API to an atomgroup, so it can work if the rmsd only looks at the positions.

orbeckst · 2017-02-13T17:32:51Z

testsuite/MDAnalysisTests/analysis/test_align.py

+        A = self.universe.trajectory[0]
+        B = self.reference.trajectory[-1]
+        rmsd = align.alignto(self.universe, self.reference, weights='mass')
+        rmsd_sup_weight = rms.rmsd(A, B,  weights=last_atoms_weight,


rmsd() can take Timestep?

Yeah. The np.asarray call works on a timestep. I didn't know that!

orbeckst · 2017-02-13T17:39:30Z

Looking good... just curious about the use of Timestep instances in the tests (see comments). Or maybe I misunderstood what's happening. But the tests pass...

kain88-de · 2017-02-13T20:08:31Z

Anything else left to do?

orbeckst · 2017-02-13T22:48:42Z

I think the timestep is always a new one I think

No, if you were to do this with the same universe then A is B and you would superimpose the same coordinates. It only works here because these are different universes. It's fine for the tests. I just don't want people to get the idea that this is how you would normally do it.
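
A toy model of the aliasing, assuming a reader that reuses one timestep object for every frame, as described in this comment. `ToyTimestep` and `ToyReader` are made-up classes for illustration, not the MDAnalysis ones.

```python
import numpy as np


class ToyTimestep:
    def __init__(self, n_atoms):
        self.frame = -1
        self.positions = np.zeros((n_atoms, 3))


class ToyReader:
    """Toy reader that reuses ONE timestep object for every frame."""

    def __init__(self, frames):
        self._frames = np.asarray(frames, dtype=float)
        self.ts = ToyTimestep(self._frames.shape[1])

    def __getitem__(self, index):
        # Moving the frame cursor overwrites the shared buffer in place.
        self.ts.frame = range(len(self._frames))[index]
        self.ts.positions[:] = self._frames[index]
        return self.ts


frames = [np.full((2, 3), 0.0), np.full((2, 3), 1.0), np.full((2, 3), 2.0)]
trajectory = ToyReader(frames)

# Same reader: A and B are the same object, both now showing the last frame.
A = trajectory[0]
B = trajectory[-1]
same_object = A is B

# Copying the positions right after indexing keeps the frames apart.
first = trajectory[0].positions.copy()
last = trajectory[-1].positions.copy()
```

In the toy model `same_object` is true, so comparing `A` with `B` would compare the last frame with itself, while `first` and `last` hold independent coordinates.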

orbeckst · 2017-02-13T22:49:12Z

Fine from my end.

…

Anything else left to do?

kain88-de added Component-Analysis help wanted refactoring labels Feb 4, 2017

kain88-de mentioned this pull request Feb 4, 2017

Refactor align #1136

Closed

7 tasks

kain88-de force-pushed the refactor-align branch from 937c523 to 4b8ea08 Compare February 4, 2017 08:39

orbeckst requested changes Feb 11, 2017

kain88-de added 10 commits February 12, 2017 16:58

refactor rms.py:rmsd

6e37376

simplify weights handling

Add weights kwarg to alignto

369877c

This new general weights kwarg can take a 'mass' argument or arbitrary weights to allow new fitting algorithms.

introduce weights kwargs also for AlignTrj

adc82ea

RMSD uses new weights kwarg, fixes bug as well

8554752

This also fixes a centering bug. Previously the center of mass was used regardless of the weights chosen. Now we center with the chosen weights. This means results might change!

correct test results!

70848c6

The old results were wrong due to a centering bug in the RMSD code.

convert RMSF to AnalysisBase

16444be

fix docs from using deprecated method

e361d63

add deprecation warnings

9103264

update CHANGELOG

eac64f3

add deprecation tests

9863cd4

kain88-de force-pushed the refactor-align branch from 4b8ea08 to 48d45d0 Compare February 12, 2017 16:35

kain88-de commented Feb 12, 2017

kain88-de added 5 commits February 12, 2017 17:46

Change encore to use arbitrary weights

60af3b2

This changes the methods in encore to the new arbitrary weights system. - fix docs and default parameters

add custom weights tests

feca58c

This compares the results obtained when 'mass' is chosen as weights with those obtained when the mass array from the atomgroup is passed directly.

add deprecation notes

616e129

add dummy rmsf run method

2253984

add RMSF.run to deprecate it later

0d6496d

kain88-de force-pushed the refactor-align branch from 48d45d0 to 0d6496d Compare February 12, 2017 16:46

kain88-de added 3 commits February 12, 2017 18:25

fix numpy warning in weights arg check

af56e57

remove useless deprication warnings in confdist.py

705e384

These warnings don't make sense since we haven't published any code with encore.

add more tests for the weights

a2317fb

kain88-de removed the help wanted label Feb 12, 2017

jbarnoud reviewed Feb 13, 2017

kain88-de and others added 2 commits February 13, 2017 09:47

use uniform verbose/quiet arg handling

d516bc7

changed description for options "selection" and "superimposition_sele…

aad74e4

…ction" in conformational_distance_matrix

orbeckst reviewed Feb 13, 2017

orbeckst approved these changes Feb 13, 2017

orbeckst self-assigned this Feb 14, 2017

orbeckst added this to the 0.16.0 milestone Feb 14, 2017

orbeckst merged commit bb5d106 into develop Feb 14, 2017

orbeckst deleted the refactor-align branch February 14, 2017 01:41
