Add rank-normalized ESS and other variants #22
Any thoughts on this @devmotion?
As mentioned somewhere else (probably in the discussion in ParetoSmooth.jl?), it would be great to improve the implementation further. We currently compute only the split version, and the design is modular only with respect to the different algorithms for computing the autocorrelation terms. Design-wise, it seems easier and more modular if the orthogonal parts are kept separate. For instance, rank-normalization could be implemented as a separate function, and users who want it could just call it first.

(For completeness: in contrast to the observation in the paper, we found the FFT-based approach to generally be the slowest, by a large margin, consistent with the observation in MCMCDiagnostics. One reason seems to be that usually only a few lags are needed before the autocorrelation terms become sufficiently small and the computation can be stopped (using the same approach based on Geyer's recommendation as in the FFT case). Another reason might be that the FFTs were not implemented in Julia but performed with FFTW, which might affect benchmarks (e.g., in StatsFuns the pure Julia implementations seem to be significantly faster than calling into Rmath). We did not notice any numerical issues with the other autocorrelation algorithms compared with the FFT approach, but of course such problems may exist and the FFT algorithm may be more accurate, as the authors state.)
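The early-stopping idea mentioned above (truncating the autocorrelation sum following Geyer's recommendation) can be sketched roughly as follows; the function name and input format are assumptions for illustration, not the package's actual internals:

```julia
# Sketch (not the actual implementation): truncate the autocorrelation sum
# using Geyer's initial positive sequence criterion.
# `rho` holds autocorrelation estimates at lags 1, 2, ...
function truncated_autocorr_sum(rho::AbstractVector{<:Real})
    s = 0.0
    k = 1
    while k + 1 <= length(rho)
        # For a reversible chain, sums of adjacent autocorrelation pairs are
        # positive; stop as soon as an estimated pair sum becomes nonpositive.
        pairsum = rho[k] + rho[k + 1]
        pairsum > 0 || break
        s += pairsum
        k += 2
    end
    return s  # feeds into the usual ESS estimate, roughly M*N / (1 + 2s)
end
```

Since most chains have autocorrelations that decay quickly, the loop usually exits after a few lags, which is consistent with the benchmark observation above.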
I don't think that's true; rather, splitting measures something different from non-splitting. Splitting makes the diagnostic sensitive to within-chain non-stationarity. I think it's probably better as a default, although I've considered whether some other trend-based estimate of stationarity would be better than splitting. I suspect that directly estimating an exponential trend and then using it to estimate the variance inflation would work better.
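For concreteness, splitting in its simplest form can be sketched as below; the helper name and the draws-by-chains matrix layout are assumptions for illustration:

```julia
# Sketch: split each chain into `nsplit` contiguous segments so that
# within-chain trends show up as between-chain differences in the diagnostic.
# `x` is a draws × chains matrix.
function split_chains(x::AbstractMatrix, nsplit::Int=2)
    ndraws = div(size(x, 1), nsplit) * nsplit  # drop leftover draws
    # column-major reshape turns each chain column into `nsplit` columns
    return reshape(x[1:ndraws, :], ndraws ÷ nsplit, :)
end
```

A chain with a trend then produces sub-chains with different means, which split-R-hat detects even when the unsplit chains look identical.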
I've locally worked on a prototype design that I think works well. A few notes. First, I propose we decouple …; concretely, I propose the following interface:

```julia
# by default return rank-normalized bulk-ESS
ess(x; kwargs...) = ess_bulk(x; kwargs...)

# ESS for estimator f; by default split each chain into 2
ess(f, x; nsplit::Int=2, method, kwargs...)

# bulk-ESS
ess_bulk(x; kwargs...) = ess(mean, rank_normalize(x); kwargs...)

# tail-ESS
ess_tail(x; kwargs...)

# convert x into a proxy expectand, i.e. one whose mean-ESS approximates
# the ESS of the estimator f; used in the default ess(f, x)
as_expectand(f, x, sample_dims)

rhat(x; kwargs...) = max.(rhat_bulk(x; kwargs...), rhat_tail(x; kwargs...))
rhat_bulk(x; kwargs...) = rhat(mean, rank_normalize(x); kwargs...)
rhat_tail(x; kwargs...) = rhat(mean, rank_normalize(fold(x)); kwargs...)
rhat(f, x; nsplit::Int=2)

rank_normalize(x; dims)
fold(x; dims) = abs.(x .- median(x; dims))
```

This API is lightweight and flexible enough to cover all specialized ESS methods in ArviZ, posterior, and brms, as well as those in the split-Rhat paper. Splitting is done by default but is easily disabled or increased, and a user can add folding or rank-normalization via the convenience functions. The recommended ESS variants from the split-Rhat paper that are not tied to specific expectations (bulk-ESS, tail-ESS, and the corresponding Rhats) are provided as convenience functions, so the user doesn't need to know which transformations to apply. It's possible there are estimators whose ESS is best approximated by combining the mean-ESS of multiple proxy expectands (std-ESS was an example of this, but a better method that requires only a single proxy is now used; see the issues linked in #39), so I propose we go with this API for now and rework it later if necessary. While we could support fancier splitting by passing a …
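For illustration, the `rank_normalize` and `fold` signatures above could be filled in along these lines, using the fractional-rank-to-normal-quantile transform from the paper; the implementation details here (flattening over all dimensions, the StatsBase/Distributions calls) are assumptions, not a final design:

```julia
using Statistics
using StatsBase: tiedrank
using Distributions: Normal, quantile

# Sketch of the two convenience transforms (not the final implementations).
# Rank-normalization: replace draws by standard-normal quantiles of their
# fractional ranks, with the (r - 3/8) / (S + 1/4) offsets from the paper.
function rank_normalize(x::AbstractArray)
    r = tiedrank(vec(x))  # average ranks across all draws and chains
    z = quantile.(Normal(), (r .- 3 / 8) ./ (length(r) + 1 / 4))
    return reshape(z, size(x))
end

# Folding: reflect draws about the median so that tail behavior is moved
# into the bulk, which is what tail-ESS/tail-R-hat then measure.
fold(x::AbstractArray) = abs.(x .- median(x))
```

Because rank-normalization only depends on the ordering of draws, the resulting bulk-ESS is well defined even for distributions without finite moments, such as the Cauchy case mentioned in the issue.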
I think that sounds great. Feel free to send me links to any relevant branches in Slack -- I was thinking of implementing something like this myself.
I performed some informal benchmarks, and for realistic chains with …, the recommended R-hat is the maximum of bulk- and tail-R-hat: it folds once, and rank-normalizes and computes mean-R-hat twice each, so it's about 200x the cost of mean-R-hat with the existing implementations. So in terms of speeding up the implementations, it makes sense to focus our efforts on …
The latter point is a good argument for keeping standalone …
My (likely) final design is the following API. First there are the methods users are most likely to call, which should have the most complete documentation:

…

Then there are the less common methods, most of which are used by the above ones, which should be only lightly documented:

…

The idea is that calling …
Hmm, is there maybe a way to make this interface cleaner or more generalizable? e.g. …
I don't see the benefit of a … Also, … Lastly, I do agree that it's not ideal how many methods we have; e.g., for ArviZ and MCMCChains to completely extend our methods for their storage types, they would need to implement 10 methods (yikes!). By comparison, posterior's API is:

…

which is not too dissimilar to what I'm proposing here, and Python ArviZ's API is:

…

which is nice in its simplicity but requires that each method be named, so e.g. …
I suppose we could adopt Python ArviZ's syntax for the base …
Here's my best idea for a keyword API that works for …

Suggestions, @devmotion and @ParadaCarleton?
This week I will update the API similar to my previous comment, so we can hopefully make the breaking release. |
Fixed by #72 |
ESS is defined in the context of a specific estimate. For example, our current ESS implementations are all in the context of estimating the mean, so they will have problems whenever the mean/variance is not finite, e.g. for the Cauchy distribution. Vehtari et al. proposed several variants of ESS/R-hat for diagnosing various problems that can manifest in posterior draws.
The different variants covered in the paper fall into the following categories:

- Splitting: …
- Pre-processing: …
- Quantity being estimated: …
Some notes: the current `AbstractESSMethod` approach doesn't give us the flexibility to specify these variants. It also uses different types more or less just to specify differences in how the autocovariance is computed, whereas there are clearly more knobs a user might like to turn. Now would seem to be a good time to revisit this API.
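For context, the mean-ESS that the current implementations target can be written schematically as follows; this is a rough, self-contained sketch using a crude truncation rule, not the package's actual algorithm:

```julia
using Statistics

# Schematic mean-ESS for an ndraws × nchains matrix of draws:
# ESS ≈ M*N / (1 + 2 * sum of autocorrelations), with the sum truncated
# once the estimates become noise (here: at the first nonpositive estimate).
function ess_mean_schematic(x::AbstractMatrix; maxlag::Int=size(x, 1) - 1)
    ndraws, nchains = size(x)
    mu = mean(x)
    v = mean(abs2, x .- mu)  # overall variance of the draws
    s = 0.0
    for k in 1:maxlag
        # average lag-k autocovariance across chains
        c = mean(mean((x[1:end-k, j] .- mu) .* (x[k+1:end, j] .- mu))
                 for j in 1:nchains)
        rho = c / v
        rho > 0 || break  # crude truncation; real code uses Geyer's criterion
        s += rho
    end
    return ndraws * nchains / (1 + 2 * s)
end
```

The point of the issue follows directly from this formula: it only estimates the efficiency of the sample mean, so every other estimand (quantiles, tail probabilities, etc.) needs a transformation or proxy expectand before this machinery applies.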