BPA Ch.13: Incorrect occ_fs #99

Open
@mikemeredith

(I'm a Stan newbie, but I have played with occupancy models a lot, including the WinBUGS and JAGS code in BPA.)

These are occupancy studies, and occ_fs is the number of sites in the sample which are occupied. As such, it cannot be less than the number of sites observed to be occupied (occ_obs) or greater than the total number of sites in the sample (R). The example code allows these constraints to be violated; it's just (bad!) luck that this doesn't happen with the example data from BPA.

A key idea is the conditional probability of occupancy, psi_con, conditional on the data. So if the species is not detected, we know the site is in the yellow area in the diagram below.
[Figure: venn_bpa_ch13]
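
For reference, the conditional probability can be written out with Bayes' rule. This is a sketch for the simple case of constant psi and p over T visits. The symbol z (the latent 0/1 occupancy state of a site) is notation I'm assuming here, and the derivation also assumes no false positives, so an unoccupied site yields no detections with probability 1.

```latex
\begin{aligned}
\Pr(z = 1 \mid \text{no detections})
  &= \frac{\Pr(\text{no detections} \mid z = 1)\,\Pr(z = 1)}
          {\Pr(\text{no detections} \mid z = 1)\,\Pr(z = 1)
           + \Pr(\text{no detections} \mid z = 0)\,\Pr(z = 0)} \\[4pt]
  &= \frac{\psi\,(1 - p)^{T}}{\psi\,(1 - p)^{T} + (1 - \psi)}
\end{aligned}
```

For a site with at least one detection the conditional probability is simply 1.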

Model site_occ.stan

The simple model has one value for psi and one for p, so all sites with no detections have the same value for psi_con, called psi_nd in the code below:

generated quantities {
  int<lower=occ_obs, upper=R> occ_fs;
  real psi_nd;            // prob present | not detected
  psi_nd = (psi * (1 - p)^T) / (psi * (1 - p)^T + (1 - psi));
  occ_fs = occ_obs + binomial_rng(R - occ_obs, psi_nd);
}
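
As a quick sanity check outside Stan, here is a minimal Python sketch of the same calculation. The function names and the numeric values (R, T, occ_obs, psi, p) are hypothetical, chosen only for illustration; in a real run psi and p would be posterior draws. It shows that a draw built this way can never fall outside [occ_obs, R].

```python
import random

def psi_not_detected(psi, p, T):
    """P(site occupied | no detections in T visits), by Bayes' rule."""
    q_all = (1.0 - p) ** T  # P(no detections | occupied)
    return psi * q_all / (psi * q_all + (1.0 - psi))

def draw_occ_fs(psi, p, T, R, occ_obs, rng):
    """One draw of occ_fs: observed-occupied sites plus a binomial
    draw over the R - occ_obs sites with no detections."""
    psi_nd = psi_not_detected(psi, p, T)
    extra = sum(rng.random() < psi_nd for _ in range(R - occ_obs))
    return occ_obs + extra

# Hypothetical values, for illustration only
rng = random.Random(1)
R, T, occ_obs = 200, 3, 60
psi, p = 0.4, 0.5

draws = [draw_occ_fs(psi, p, T, R, occ_obs, rng) for _ in range(1000)]
print(min(draws) >= occ_obs, max(draws) <= R)  # prints: True True
```

With these values psi_nd is 0.05 / 0.65, about 0.077, so most of the undetected sites are drawn as unoccupied.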

Model site_occ_cov.stan

Here each site has a different psi and p, and a good strategy is to calculate psi_con for each, which is in any case a result you may want to monitor.

generated quantities {
  int occ_fs;       // Number of occupied sites
  real psi_con[R];  // prob occupied | data
  int z[R];         // occupancy indicator, 0/1

  for (i in 1:R) {
    if (sum_y[i] == 0) {  // species not detected
      real      psi = inv_logit(logit_psi[i]);
      vector[T] q = inv_logit(-logit_p[i])';  // q = 1 - p
      real      qT = prod(q);
      psi_con[i] = (psi * qT) / (psi * qT + (1 - psi));
      z[i] = bernoulli_rng(psi_con[i]);
    } else {             // species detected at least once
      psi_con[i] = 1;
      z[i] = 1;
    }
  }
  occ_fs = sum(z);
}
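
The per-site logic can be checked in plain Python too. This is only a sketch: the three-site data set and the helper names are made up for illustration, and the logit values stand in for a single posterior draw.

```python
import math
import random

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def psi_con_site(logit_psi, logit_p, n_detections):
    """P(occupied | data) for one site, given per-visit logit(p)."""
    if n_detections > 0:
        return 1.0  # detected at least once, so certainly occupied
    psi = inv_logit(logit_psi)
    # inv_logit(-x) equals 1 - inv_logit(x), i.e. q = 1 - p per visit
    q_all = math.prod(inv_logit(-lp) for lp in logit_p)
    return psi * q_all / (psi * q_all + (1.0 - psi))

# Hypothetical data: 3 sites, 2 visits each
rng = random.Random(42)
logit_psi = [0.0, 0.0, -1.0]
logit_p = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
sum_y = [2, 0, 0]  # detections per site

psi_con = [psi_con_site(a, b, s) for a, b, s in zip(logit_psi, logit_p, sum_y)]
z = [1 if rng.random() < pc else 0 for pc in psi_con]
occ_fs = sum(z)
```

Site 1 has detections, so its psi_con is 1 and z is always 1. Site 2 has psi = 0.5 and p = 0.5 on both visits, giving psi_con = 0.125 / 0.625 = 0.2.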

Model bluebug.stan

The same strategy can be applied here, this time allowing for the variable number of visits to each site. So we replace

vector[T] q = inv_logit(-logit_p[i])';

with

vector[last[i]] q = inv_logit(-logit_p[i, 1:last[i]])';
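
To see why the truncation matters, here is a small Python sketch with a hypothetical site that has four columns in the data but was visited only twice. The names are illustrative, not taken from the model code.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def prob_no_detection(logit_p_row, n_visits):
    """P(no detections | occupied) over the first n_visits visits only."""
    # Python's row[:n] selects the same elements as Stan's row[1:n]
    return math.prod(inv_logit(-lp) for lp in logit_p_row[:n_visits])

# Hypothetical site: p = 0.5 on every occasion, only 2 visits made
logit_p_row = [0.0, 0.0, 0.0, 0.0]
q_visited = prob_no_detection(logit_p_row, 2)   # 0.5 * 0.5 = 0.25
q_all_cols = prob_no_detection(logit_p_row, 4)  # 0.5 ** 4 = 0.0625
```

Counting the unvisited occasions makes the no-detection probability too small, which in turn understates psi_con for that site.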

Other models

I haven't looked closely at the dynamic models, but it appears that the reported z and n_occ values are also based on unconditional psi values. Of more concern are the closed-capture models in Ch.6, where the analogous value, the population size N, appears to have the same problem. There the numerical impact is smaller, as the denominator of the conditional inclusion probability is always large (i.e., close to 1), but it should still be fixed.
