Description
(I'm a Stan newbie, but I have played with occupancy models a lot, including the WinBUGS and JAGS code in BPA.)
These are occupancy studies, and occ_fs is the number of sites in the sample which are occupied. As such, it cannot be less than the number of sites observed to be occupied (occ_obs) or greater than the total number of sites in the sample (R). The example code allows these constraints to be violated; it's just (bad!) luck that this doesn't happen with the example data from BPA.
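To make the problem concrete, here is a minimal Python sketch (not the Stan code itself). It assumes the example code redraws every site's state from the unconditional psi, and the values of R, occ_obs and psi are made up for illustration. It only checks the lower bound.

```python
import numpy as np

rng = np.random.default_rng(1)

R = 50          # sites in the sample (hypothetical value)
occ_obs = 30    # sites with at least one detection (hypothetical value)
psi = 0.6       # one posterior draw of occupancy probability (hypothetical value)
n_draws = 10_000

# Unconditional draw: every site's state is redrawn from psi,
# ignoring the fact that occ_obs sites are known to be occupied.
occ_fs_uncond = rng.binomial(R, psi, size=n_draws)

# Share of draws that fall below the number of sites known to be occupied.
share_below = np.mean(occ_fs_uncond < occ_obs)
print(f"share of draws with occ_fs < occ_obs: {share_below:.3f}")
```

With data where psi is well above occ_obs / R the share becomes small, which is consistent with the violation going unnoticed.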
A key idea is psi_con, the probability of occupancy conditional on the data. So if the species is not detected, we know the site is in the yellow area in the diagram below.
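For reference, this conditional probability follows from Bayes' rule. A sketch for a site with T visits and no detections, where z is the true occupancy state, y_t the detection on visit t, and p_t the detection probability on visit t (notation assumed here, chosen to match psi and p in the models):

$$
\psi_{\mathrm{con}} = \Pr(z = 1 \mid y_1 = \dots = y_T = 0)
= \frac{\psi \prod_{t=1}^{T} (1 - p_t)}{\psi \prod_{t=1}^{T} (1 - p_t) + (1 - \psi)}
$$

With constant p the product reduces to (1 - p)^T, and for a site with at least one detection the conditional probability is 1.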
Model site_occ.stan
The simple model has one value for psi and one for p, so all sites with no detections have the same value for psi_con, called psi_nd in the code below:
generated quantities {
  int<lower=occ_obs, upper=R> occ_fs;
  real psi_nd; // prob present | not detected
  psi_nd = (psi * (1 - p)^T) / (psi * (1 - p)^T + (1 - psi));
  occ_fs = occ_obs + binomial_rng(R - occ_obs, psi_nd);
}
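The same calculation can be checked outside Stan. This is a rough Python mirror of the block above for a single posterior draw; the function name draw_occ_fs and the numeric inputs are hypothetical.

```python
import numpy as np

def draw_occ_fs(psi, p, T, R, occ_obs, rng):
    """Draw the number of occupied sites for one posterior draw of psi and p."""
    q_T = (1.0 - p) ** T                             # P(no detection in T visits | occupied)
    psi_nd = psi * q_T / (psi * q_T + (1.0 - psi))   # P(occupied | never detected)
    # Only the R - occ_obs sites without detections are uncertain.
    occ_fs = occ_obs + rng.binomial(R - occ_obs, psi_nd)
    return occ_fs, psi_nd

rng = np.random.default_rng(42)
draws = [draw_occ_fs(psi=0.6, p=0.3, T=3, R=50, occ_obs=25, rng=rng)
         for _ in range(1000)]
occ_fs = np.array([d[0] for d in draws])
psi_nd = draws[0][1]
```

By construction every draw lies between occ_obs and R, so the declared bounds on occ_fs can never be violated.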
Model site_occ_cov.stan
Here each site has a different psi and p, and a good strategy is to calculate psi_con for each, which is in any case a result you may want to monitor.
generated quantities {
  int occ_fs; // Number of occupied sites
  real psi_con[R]; // prob occupied | data
  int z[R]; // occupancy indicator, 0/1
  for (i in 1:R) {
    if (sum_y[i] == 0) { // species not detected
      real psi = inv_logit(logit_psi[i]);
      vector[T] q = inv_logit(-logit_p[i])'; // q = 1 - p
      real qT = prod(q[]);
      psi_con[i] = (psi * qT) / (psi * qT + (1 - psi));
      z[i] = bernoulli_rng(psi_con[i]);
    } else { // species detected at least once
      psi_con[i] = 1;
      z[i] = 1;
    }
  }
  occ_fs = sum(z);
}
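A vectorised Python sketch of the same per-site logic, useful for sanity-checking the Stan output. It assumes logit_psi has shape (R,), logit_p has shape (R, T), and y is the 0/1 detection matrix; the helper names and the toy inputs are hypothetical.

```python
import numpy as np

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def conditional_occupancy(logit_psi, logit_p, y):
    """Return psi_con[i] = P(site i occupied | detection history y[i])."""
    psi = inv_logit(logit_psi)                    # shape (R,)
    q_T = inv_logit(-logit_p).prod(axis=1)        # prod over visits of (1 - p)
    psi_con = psi * q_T / (psi * q_T + (1.0 - psi))
    detected = y.sum(axis=1) > 0
    return np.where(detected, 1.0, psi_con)       # detected sites are certainly occupied

rng = np.random.default_rng(7)
logit_psi = np.array([0.5, -1.0, 2.0, 0.0])       # toy values, R = 4
logit_p = rng.normal(size=(4, 3))                 # toy values, T = 3
y = np.array([[0, 1, 0],
              [0, 0, 0],
              [0, 0, 0],
              [1, 1, 0]])

psi_con = conditional_occupancy(logit_psi, logit_p, y)
z = rng.binomial(1, psi_con)                      # occupancy indicators, 0/1
occ_fs = int(z.sum())                             # number of occupied sites
```

For sites without detections psi_con is always below the unconditional psi, because the product of the (1 - p) terms is less than 1.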
Model bluebug.stan
The same strategy can be applied here, this time allowing for the variable number of visits to each site. So we replace
vector[T] q = inv_logit(-logit_p[i])';
with
vector[last[i]] q = inv_logit(-logit_p[i, 1:last[i]])';
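The effect of truncating at last[i] can be sketched in Python. The sketch assumes that entries of logit_p beyond a site's last visit are padding (NaN here) and must not enter the product; the function name and the toy values are hypothetical. Stan's 1-based range 1:last[i] corresponds to the Python slice [:last[i]].

```python
import numpy as np

def prob_never_detected(logit_p_row, n_visits):
    """Product of (1 - p) over the visits actually made to one site."""
    q = 1.0 / (1.0 + np.exp(logit_p_row[:n_visits]))   # q = 1 - p = inv_logit(-logit_p)
    return float(q.prod())

# Two toy sites: the first was visited twice, the second three times.
logit_p = np.array([[0.0, 0.0, np.nan],
                    [0.0, 0.0, 0.0]])
last = np.array([2, 3])

q_T = [prob_never_detected(logit_p[i], last[i]) for i in range(len(last))]
```

The resulting q_T values then take the place of qT in the psi_con formula, one per site.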
Other models
I haven't looked closely at the dynamic models, but it appears that the reported z and n_occ values are also based on unconditional psi values. Of more concern are the closed-capture models in Ch. 6, where the analogous value, the population size N, appears to have the same problem; this has a smaller numerical impact, as the denominator of the conditional inclusion probability is always large (i.e., close to 1), but it should be fixed.