SBC FAQ
The number of datasets `N` and the number of posterior samples from each dataset `M` both affect SBC test results. `M` is best understood as the minimum targeted effective sample size across all parameters. Within the SBC workflow, these choices are encoded in the `n_datasets`, `iter_warmup`, `iter_sampling`, and `thin_ranks` arguments.
The example below, in which diagnostics change as `thin_ranks` is increased from 10 (the default) to 40 and then 50, shows the need for a more robust choice of these input arguments.
It is recommended that the posterior sample be free of autocorrelation, which can be addressed in two ways. The first is online: inspect `ess_tail` during sampling and double the number of iterations until the targeted ESS is reached. The second is at the results stage: double the thinning level used by `compute_results` via the `thin_ranks` argument.
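The effect of thinning on autocorrelation can be illustrated with a small sketch (the helpers `lag1_autocorr` and `thin` are hypothetical illustrations, not functions from the SBC package):

```python
import random

def lag1_autocorr(x):
    # Sample lag-1 autocorrelation of a chain of draws.
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((xi - m) ** 2 for xi in x)
    return num / den

def thin(draws, thin_ranks):
    # Keep every thin_ranks-th draw, trading raw sample count
    # for approximately independent draws.
    return draws[::thin_ranks]

# Simulate a strongly autocorrelated AR(1) chain as a stand-in
# for an MCMC chain with poor mixing.
rng = random.Random(0)
chain = [0.0]
for _ in range(4999):
    chain.append(0.9 * chain[-1] + rng.gauss(0, 1))
```

For an AR(1) chain with coefficient 0.9, thinning by 10 should bring the lag-1 autocorrelation from roughly 0.9 down toward 0.9**10 ≈ 0.35, at the cost of a tenfold smaller sample.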
The choice rests with the modeler, but we have observed that `lp__` and `ess_tail` are the most conservative diagnostics in most cases. As illustrated below, `lp__` tends to be the most robust, giving a non-NA ESS; a higher `ess_bulk` does not necessarily mean a higher `ess_tail`. Refer to Rhat_ESS for a comparison of the two.
```
> fit[[1]]
 variable  rhat  ess_bulk  ess_tail
 lp__      1.01       224       219
 rate[1]   1.04       129       243
 rate[2]   1.04       118        NA
 rate[3]   1.05       120        NA
 p21       1.01       308       340
 p31       1.01       342       253

> fit[[11]]
 variable  rhat  ess_bulk  ess_tail
 lp__      1.02       185       249
 rate[1]   1.01       236        NA
 rate[2]   1.01       229       226
 rate[3]   1.01       210       200
 p21       1.01       258       301
 p31       1.01       369        NA
```
A caveat: tuning arguments until the SBC test passes could understate the false-positive rate.
Ongoing discussion on the SBC test can be found in this post.
When dealing with discrete parameter values, the samples may include ties. Eq. 1 and 2 from this paper give two ways to "smooth" the ranked samples back to a discrete uniform distribution with `1 + ceiling(M / thin_ranks)` possible ranks. Implemented in this package is the randomized rank smoothing defined in Eq. 1 of the article. In short, a prior sample `theta` is assigned a random rank chosen uniformly between `sum(posterior_theta < theta)` and `sum(posterior_theta <= theta)`, where `posterior_theta` is a sample from the posterior distribution of the parameter conditioned on the data generated from the prior draw `theta`.
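The randomized smoothing above can be sketched as follows (a minimal Python sketch; the function name `smoothed_rank` is hypothetical and this is not the package's R implementation):

```python
import random

def smoothed_rank(theta, posterior_theta, rng=random):
    # Count posterior draws strictly below and at-or-below the prior draw.
    lo = sum(1 for x in posterior_theta if x < theta)   # sum(posterior_theta < theta)
    hi = sum(1 for x in posterior_theta if x <= theta)  # sum(posterior_theta <= theta)
    # With no ties lo == hi and the rank is deterministic; with ties the
    # rank is drawn uniformly from {lo, ..., hi}, which restores a discrete
    # uniform distribution of ranks under a well-calibrated model.
    return rng.randint(lo, hi)
```

With `posterior_theta = [1, 1, 3]` and `theta = 2` there are no ties, so the rank is deterministically 2; with `posterior_theta = [2, 2, 2]` and `theta = 2` every draw ties, so the rank is uniform on {0, 1, 2, 3}.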
A numeric diagnostic that compares prior and posterior comes in three flavors:
- point-to-set metric (z-score and contraction)
- set-to-set metric
- rank-to-uniform metric

A z-score versus contraction plot is an example of a point-to-set comparison (Betancourt, 2018). Writing `theta_tilde` for a prior sample from the true model, the z-score measures the relative bias of the posterior with respect to `theta_tilde`, while the contraction `c` measures how much the posterior concentrates relative to the prior, computed for each prior sample.
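Using the standard definitions from Betancourt (2018) — z = (posterior mean − theta_tilde) / posterior sd, and c = 1 − posterior variance / prior variance — a minimal sketch of the two quantities might look like this (hypothetical helpers, not the package's API):

```python
from statistics import mean, pstdev, pvariance

def z_score(theta_tilde, posterior):
    # Relative bias: how many posterior standard deviations the
    # posterior mean sits away from the true prior draw theta_tilde.
    return (mean(posterior) - theta_tilde) / pstdev(posterior)

def contraction(prior_variance, posterior):
    # Posterior contraction: 1 means the posterior fully concentrates
    # relative to the prior, 0 means the data carried no information.
    return 1 - pvariance(posterior) / prior_variance
```

A well-behaved model should place most points near z = 0 with contraction close to 1; large |z| signals bias, and low contraction signals a weakly informative likelihood.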
The set-to-set metric extends the point-to-set metric by measuring the distance between the prior sample and the posterior samples as a whole. The figure below illustrates the difference between the first two flavors, where `A` in the right figure denotes prior samples passing through data simulation and then posterior simulation.
Distance metrics between two discrete distributions, `D(sum_m(posterior_mn < prior_n), uniform)`, are supported, such as `pval`, `max_diff`, `Wasserstein`, and `Cumulative Jensen-Shannon divergence`.
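As one example of such a distance, the 1-Wasserstein distance between the empirical rank distribution and the discrete uniform can be sketched as the L1 gap between their CDFs (a hypothetical helper for illustration, not the package's implementation):

```python
def wasserstein_to_uniform(ranks, n_ranks):
    # 1-Wasserstein distance on integer support {0, ..., n_ranks - 1}:
    # with unit spacing between support points it equals the sum of
    # absolute differences between the two CDFs.
    n = len(ranks)
    emp_cdf, uni_cdf, dist = 0.0, 0.0, 0.0
    for r in range(n_ranks):
        emp_cdf += sum(1 for x in ranks if x == r) / n
        uni_cdf += 1.0 / n_ranks
        dist += abs(emp_cdf - uni_cdf)
    return dist
```

Perfectly uniform ranks give a distance of 0, while ranks piled onto a single value (a badly miscalibrated posterior) give a large distance.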
Bias and dispersion are the two main calibration targets for a given joint-distribution simulator. In the four-quadrant SBC_diff plot, a single breach (a point outside the interval) may indicate bias, whereas breaches in the opposing quadrants (1, 3) or (2, 4) may point to a dispersion problem.