
Discarding initial samples to burn-in #38

Open · jchodera opened this issue May 17, 2015 · 4 comments

@jchodera (Contributor)

We probably want a scheme to automatically discard initial BHMM samples to burn-in. One way to do this would be to record the log-likelihood of the BHMM posterior and then use automated equilibration detection (https://github.com/choderalab/pymbar/blob/master/pymbar/timeseries.py#L710-L775) to discard initial samples to equilibrium.
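A minimal pure-Python sketch of what such automated equilibration detection could look like, following the idea used in pymbar's timeseries module: for each candidate burn-in point, estimate the statistical inefficiency of the remaining samples and keep the burn-in that maximizes the effective sample size. The function names and the synthetic log-likelihood trace below are illustrative, not BHMM or pymbar code:

```python
import random
import statistics

def statistical_inefficiency(x):
    """Estimate g = 1 + 2 * sum of windowed autocorrelations,
    truncated at the first non-positive correlation."""
    n = len(x)
    mean = statistics.fmean(x)
    d = [v - mean for v in x]
    var = sum(v * v for v in d) / n
    if var == 0.0:
        return 1.0
    g = 1.0
    for t in range(1, n):
        c = sum(d[i] * d[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(1.0, g)

def detect_equilibration(x, step=10):
    """Choose the burn-in t0 that maximizes N_eff = (N - t0) / g."""
    best_t0, best_neff = 0, 0.0
    for t0 in range(0, len(x) - 2, step):
        tail = x[t0:]
        neff = len(tail) / statistical_inefficiency(tail)
        if neff > best_neff:
            best_t0, best_neff = t0, neff
    return best_t0, best_neff

# Synthetic "log-likelihood" trace: an initial relaxation transient
# followed by stationary noise, mimicking a chain approaching equilibrium.
random.seed(0)
trace = [-10.0 * (0.95 ** i) + random.gauss(0.0, 0.1) for i in range(500)]
t0, neff = detect_equilibration(trace)
```

On a trace like this, the detected burn-in falls after the transient, and everything before it would be discarded before analyzing the sampled models.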

@franknoe (Contributor)

The posterior is unimodal, we start from the maximum likelihood, and for each sample we generate a number of Gibbs steps before using it. I would be surprised if this is an issue here. We are probably making far too many steps given the sizes of our matrices. But if there's an easy way to check, why not.

Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

@jchodera (Contributor, Author)

> The posterior is unimodal,

Not necessarily. There's permutation symmetry if we don't enforce an ordering on the state means, and even then, I'm not certain it is unimodal.

> we start from the maximum likelihood,

That doesn't mean we can get rid of burn-in---in fact, it means that we might be starting relatively far from a "typical sample" from the posterior if it is broad.

> and for each sample we generate a number of Gibbs steps before using it.

This is certainly helpful, but we do the same thing in many MD simulations, and we still have to discard to burn-in or run for a very long time.

> I would be surprised if this is an issue here. Probably we are making way too many steps given the sizes of our matrices. But if there's an easy way to check, why not.

If we can just compute the log Bayesian posterior for each model, that would be an easy quantity to examine for the model timeseries!

@franknoe (Contributor)

OK. Do you want to do this, or should I look into it?

Does a decorrelated log Bayesian posterior mean that other observables are decorrelated as well? I guess we can have quite different effective decorrelation times in different observables.


@jchodera (Contributor, Author)

> OK. Do you want to do this or should I look into it?

I'm not quite sure where all the bits get calculated at this point, so if it is easier for you to compute the log-likelihood for the sampled BHMM models, I can focus on the burn-in analysis.

> Does a decorrelated log Bayesian posterior mean that other observables are decorrelated as well? I guess we can have quite different effective decorrelation times in different observables.

Other observables can certainly have different correlation times, but the correlation time of the log posterior is a lower bound on the slowest relaxation/mixing time of the BHMM sampler chain. It's what I would consider "due diligence" for sampling.
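The point about observable-dependent correlation times can be illustrated with a toy example: two AR(1) chains with different autocorrelation coefficients have very different integrated autocorrelation times, so decorrelation of one observable (say, the log posterior) need not imply decorrelation of every other observable. A pure-Python sketch; `integrated_autocorr_time` and `ar1` are illustrative helpers, not bhmm API:

```python
import random

def integrated_autocorr_time(x, max_lag=200):
    """tau_int = 1/2 + sum of normalized autocorrelations,
    truncated at the first non-positive estimate."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d) / n
    tau = 0.5
    for t in range(1, min(max_lag, n - 1)):
        c = sum(d[i] * d[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        tau += c
    return tau

def ar1(phi, n, seed):
    """AR(1) chain x_t = phi * x_{t-1} + N(0, 1);
    the exact integrated autocorrelation time is 1/2 + phi/(1 - phi)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

fast = ar1(0.1, 20000, seed=1)  # exact tau_int ~ 0.6
slow = ar1(0.9, 20000, seed=2)  # exact tau_int ~ 9.5
tau_fast = integrated_autocorr_time(fast)
tau_slow = integrated_autocorr_time(slow)
```

The two estimates differ by an order of magnitude, which is why checking only one observable (the log posterior) is a necessary but not sufficient diagnostic for the mixing of the whole chain.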
