-
Notifications
You must be signed in to change notification settings - Fork 219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is init_theta keyword working? #1588
Comments
Unfortunately, the documentation is largely outdated. It is called Turing.jl/src/inference/hmc.jl Line 148 in 602aa5f
Edit: Ah, I see you already noticed this. |
Did you keep the warmup phase in NUTS? Otherwise probably it is difficult to check if your sampler started from your initial setting. |
Regarding the implementation: it is actually not part of Turing.jl but defined in DynamicPPL.jl (https://github.com/TuringLang/DynamicPPL.jl/blob/9d4137eb33e83f34c484bf78f9a57f828b3c92a0/src/sampler.jl#L76). Maybe at some point it could be moved to AbstractMCMC.jl even. IIRC HMC just has to be handled in a special way since we set some default values if they are not provided by the user with the sampler, and therefore the keyword argument shows up there (and in the emcee sampler AFAIK) but not in any other sampler in Turing.jl. |
With NUTS, I first used a warmup-phase of 1, but even with a warm-up phase of 0, it still seems to ignore the supplied values. |
Can you add a print statement in Turing.jl/src/inference/hmc.jl Line 157 in 602aa5f
Turing.jl/src/inference/hmc.jl Lines 194 to 220 in 602aa5f
|
I did a |
Oh, I assume the problem is that It would make sense that the error does only occur with HMC since AFAIK IMO we should change the behaviour of |
Alternatively, one could always resample when using a sampler and perform evaluation only with a dedicated evaluation context (would require some refactoring along the lines of TuringLang/DynamicPPL.jl#80 probably). One has to be a bit careful though and probably special case the evaluation context for every sampler since depending on the sampler we store the log-likelihood, the log joint probability, or other things in |
My hypothesis in #1588 (comment) is correct, the problem is that |
Sounds like this is the same problem that @mateuszbaran had #1563. Someone was also asking about this in the slack channel a few weeks ago if I remember. |
Yes, that looks like the same problem. Unfortunately all I can tell is that commenting out that line didn't improve sampling from my model (but fixing one problem with my data and giving better prior distributions did help). |
I opened a PR with a temporary workaround in TuringLang/DynamicPPL.jl#232. |
…niform` (#232) This PR is a quick fix for TuringLang/Turing.jl#1563 and TuringLang/Turing.jl#1588. As explained in TuringLang/Turing.jl#1588 (comment), the problem is that currently `SampleFromUniform` always resamples variables in every run, and hence also initial parameters provided by users are resampled in https://github.com/TuringLang/DynamicPPL.jl/blob/9d4137eb33e83f34c484bf78f9a57f828b3c92a0/src/sampler.jl#L80. As mentioned in TuringLang/Turing.jl#1588 (comment), a better long term solution would be to fix this inconsistency and use dedicated evaluation and sampling contexts, as suggested in #80.
The fix is available in the latest release of DynamicPPL. It should be installed automatically if you update your packages. |
I've been trying to start my samplers in "better" regions of the parameter space, since it's a bit slow to converge otherwise on certain parameters. But from what I can tell, whenever I pass
init_theta
as per the docs, whatever values I have there are just ignored.MWE:
Here, I start the sampler in a far-away place. The MH proposals in
m1
are tiny, so it should be taking quite a while to get away from the region around (-10,-10); apparently, the starting values are just being ignored. I wasn't sure if it was just a MH issue, so I try HMC as well in the above example, and it appears to also ignore the supplied starting values.After digging around some more, I tried changing it to the keyword
init_params
instead ofinit_theta
. This works for the MH code above, but not for HMC (oddly enough, since the HMC sampling code is where I found theinit_params
keyword in the first place. I also tried NUTS, and didn't see any evidenceinit_params
was being respected there, either (although it's much harder to tell with an algorithm like NUTS, that can move away from starting values so quickly).So, it's not clear to me if this is just a case of the docs being in need of an update, or if there's something else wrong.
And in any case, if
init_theta
isn't a valid keyword anymore, perhaps supplying such a keyword should give an error or something? I just spent several days trying to debug code that I thought had some subtle identification error that was making my sampler "blow up" right at the start of sampling, when really it was justsample
silently ignoring starting values set by a deprecated(?) keyword.FWIW, being able to catch stuff like this easily is one reason I strongly favor recording the initial values of samplers as part of the chain itself (per the discussion in #1282), something I always do if writing samplers by hand.
The text was updated successfully, but these errors were encountered: