
Strategy for adding more model parameters within the blocked Gibbs Sampling framework by adding more blocks #71

Open
gully opened this issue Jul 1, 2016 · 3 comments

gully (Collaborator) commented Jul 1, 2016

I had an idea while watching a video on Gibbs sampling (the bit beginning at roughly 1h09m): Maybe we can adapt the Gibbs sampler to deal with correlated parameters after all. I'm not sure how to implement it, but the main procedure would be something like this:

  1. Update the ~6 nuisance parameters, holding everything else fixed
  2. Update the ~6 traditional stellar parameters (à la Czekala et al. 2015)
  3. (new) Update whatever new stellar model parameters you've added (e.g. veiling, starspots, binarity)

My guess is that this three-level blocked Gibbs sampler would outperform a two-level Gibbs with the new stellar parameters included in block 2.
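
In pseudocode, the three-block cycle might look like the following minimal sketch. The toy log-posterior, block contents, and step sizes are made-up placeholders, not Starfish's actual interface; it's only meant to show the Metropolis-within-Gibbs structure:

```python
import numpy as np

rng = np.random.default_rng(42)

def ln_post(blocks):
    # Placeholder log-posterior over all parameters; the real thing would
    # evaluate the spectral likelihood from Czekala et al. 2015.
    return -0.5 * sum(np.sum(v ** 2) for v in blocks.values())

def gibbs_mh_sweep(blocks, steps):
    """One sweep: Metropolis-update each block while holding the others fixed."""
    for name in ("nuisance", "stellar", "new_stellar"):
        proposal = {k: v.copy() for k, v in blocks.items()}
        proposal[name] = blocks[name] + steps[name] * rng.standard_normal(blocks[name].size)
        if np.log(rng.uniform()) < ln_post(proposal) - ln_post(blocks):
            blocks = proposal
    return blocks

blocks = {
    "nuisance":    np.zeros(6),  # block 1: ~6 nuisance parameters
    "stellar":     np.zeros(6),  # block 2: ~6 traditional stellar parameters
    "new_stellar": np.zeros(2),  # block 3: e.g. veiling fraction, spot coverage
}
steps = {"nuisance": 0.1, "stellar": 0.05, "new_stellar": 0.05}

chain = []
for _ in range(5000):
    blocks = gibbs_mh_sweep(blocks, steps)
    chain.append({k: v.copy() for k, v in blocks.items()})
```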

For example, I witnessed poor convergence when I attempted to fit a mixture model with ~8 stellar parameters (3 of which were strongly correlated) in block 2, as discussed in detail in Issue #35. That tension led me to a major departure from the Gibbs sampler: I now run emcee to sample all 14 stellar+nuisance parameters simultaneously, but with the major limitation that I have to chunk the data by spectral order, thereby deriving multiple independent sets of stellar posteriors for the ~50+ spectral orders. The extension to the Gibbs framework described here, if it can be implemented, would return us to the much better situation of having a single posterior that is consistent with all the data.
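
For reference, the per-order emcee workaround looks schematically like this; everything except the emcee calls (EnsembleSampler, run_mcmc, get_chain) is a hypothetical placeholder:

```python
import numpy as np
import emcee

def ln_prob(theta, order_data):
    # Placeholder: would evaluate the 14-parameter stellar+nuisance posterior
    # against a single spectral order's data.
    return -0.5 * np.sum(theta ** 2)

spectral_orders = [None] * 3        # stand-in for the ~50 real spectral orders
ndim, nwalkers, nsteps = 14, 40, 2000

per_order_samples = []
for order_data in spectral_orders:  # one independent fit per order
    p0 = 1e-3 * np.random.randn(nwalkers, ndim)
    sampler = emcee.EnsembleSampler(nwalkers, ndim, ln_prob, args=(order_data,))
    sampler.run_mcmc(p0, nsteps)
    # get_chain is the emcee v3 interface; older versions expose sampler.flatchain
    per_order_samples.append(sampler.get_chain(discard=500, flat=True))
# The per-order posteriors then have to be combined with heuristics,
# which is exactly the limitation described above.
```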

This seems obvious in hindsight, so it must be a good idea, right?

gully (Collaborator, Author) commented Jul 1, 2016

It should be noted that the three-block procedure would take ~1.5x longer per iteration than the existing two-block procedure, but the Metropolis-Hastings convergence time should be superior in the three-block case.

It should also be noted that the three-block procedure would be roughly N_walkers/3 times faster than the emcee modification described above, assuming the number of spectral orders equals the number of CPUs. In my case of n_orders ~ n_walkers ~ n_cpus = 40, this would naively be a ~10x speedup. There is still the non-negligible, albeit ameliorated, problem of tuning the step sizes for the Metropolis-Hastings proposals, so the emcee modification is not quite so terrible, especially in human-attention time; its main demerits are yielding too many independent inferences that have to be combined with heuristics, and its sensitivity to order-level systematics.
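
To spell out the bookkeeping, here is a toy version of that accounting, under the stated assumption that each iteration is dominated by likelihood evaluations and n_orders ~ n_walkers ~ n_cpus = 40:

```python
# Purely illustrative arithmetic for the claims above.
n_walkers = 40

cost_2block = 2          # likelihood sweeps per Gibbs iteration (blocks 1-2)
cost_3block = 3          # one extra sweep for the new block 3
cost_emcee  = n_walkers  # one likelihood call per walker per emcee iteration

print(cost_3block / cost_2block)  # 1.5  -> ~1.5x slower than the 2-block Gibbs
print(cost_emcee / cost_3block)   # ~13  -> roughly N_walkers/3, i.e. the ~10x figure above
```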

iancze (Collaborator) commented Jul 5, 2016

I'm all for new implementations of the sampling procedure, especially if they can be done in a modular fashion. I'm a little unclear on how breaking the additional parameters (i.e., spot coverage fraction, T_eff_2, etc.) into a separate Gibbs step will help with the correlation. Since these parameters will still be correlated with the parameters in step #2, wouldn't it be faster to tune the M-H jump in step 2 to the correlations of the space? As you point out, this is in general a pain to do, since it requires first burning in some runs using a guess at the correlation, and then restarting the chain with a (noisy) estimate of the correlation made from the previous samples. Are you thinking that the third Gibbs step would ameliorate this issue?
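
For concreteness, the burn-in / re-estimate / restart loop could be sketched like this (illustrative names and a toy 2-D correlated posterior, not Starfish code):

```python
import numpy as np

rng = np.random.default_rng(0)

def ln_post(theta):
    # Toy correlated posterior standing in for the block-2 stellar parameters.
    cov = np.array([[1.0, 0.9],
                    [0.9, 1.0]])
    return -0.5 * theta @ np.linalg.solve(cov, theta)

def mh_run(theta0, prop_cov, nsteps):
    """Plain M-H with a multivariate-normal proposal of covariance prop_cov."""
    chol = np.linalg.cholesky(prop_cov)
    theta, lp, chain = theta0.copy(), ln_post(theta0), []
    for _ in range(nsteps):
        prop = theta + chol @ rng.standard_normal(theta.size)
        lp_prop = ln_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(theta.copy())
    return np.array(chain)

# 1) burn in with a naive isotropic proposal (acceptance suffers along correlated directions)
burn = mh_run(np.zeros(2), 0.1 * np.eye(2), 2000)
# 2) estimate the correlation structure from the burn-in samples (a noisy estimate)
emp_cov = np.cov(burn[500:].T)
# 3) restart with a proposal shaped to that covariance (Haario-style 2.38**2 / ndim scaling)
tuned = mh_run(burn[-1], (2.38 ** 2 / 2) * emp_cov, 5000)
```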

gully (Collaborator, Author) commented Jul 7, 2016

Are you thinking that the third Gibbs step would ameliorate this issue?

I assumed so based on watching the Iain Murray video segment in which he introduces Gibbs sampling as a way to ameliorate the problem of correlated samples. It's also mentioned in this Hogg blog post:

http://hoggresearch.blogspot.com/2012/12/emcee-vs-gibbs.html

You're right that tuning the M-H sampler should work, and it does for sampling normal singleton stars. The problem only arises in practice for starspot models, where I observed acceptance fractions << 1%. I suppose iteration on the tuning parameters would eventually work, but again, in practice this was a pain and involved lots of human intervention. One alternative would be to make an analytic affine transformation of variables: we know the logOmegas will correlate in a relatively predictable way, for example. My hope was that tweaking the Gibbs sampler would avoid this step, so that adding extensions to the model would be relatively effortless for the user.
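
A minimal sketch of that idea, assuming we can write down (or guess) the covariance among the correlated parameters up front; the numbers and function names here are illustrative only:

```python
import numpy as np

# Made-up guess at the covariance among the strongly correlated parameters
# (e.g. the per-order logOmega values); in practice this would come from the
# known structure of the model or a previous run.
cov_guess = np.array([[1.0, 0.8],
                      [0.8, 1.0]])
L = np.linalg.cholesky(cov_guess)

def to_whitened(theta):
    """theta -> z = L^{-1} theta, where the components are approximately decorrelated."""
    return np.linalg.solve(L, theta)

def to_physical(z):
    """z -> theta = L z, back to the physical parameters."""
    return L @ z

def ln_post_z(z):
    # The sampler proposes isotropic steps in z; because the map is linear,
    # the Jacobian is constant and drops out of the M-H acceptance ratio.
    theta = to_physical(z)
    return -0.5 * theta @ np.linalg.solve(cov_guess, theta)  # toy posterior
```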
