BPA Ch.13: Incorrect `occ_fs`

(I'm a Stan newbie, but I have played with occupancy models a lot, including the WinBUGS and JAGS code in BPA.)

These are occupancy studies, and `occ_fs` is the number of sites in the sample which are occupied. As such, it cannot be less than the number of sites observed to be occupied (`occ_obs`) or greater than the total number of sites in the sample (`R`). The example code allows these constraints to be violated; it's just (bad!) luck that this doesn't happen with the example data from BPA.

The key idea is the probability of occupancy conditional on the data, `psi_con`. If the species is detected at least once, the site is certainly occupied. If the species is not detected, we know the site is in the yellow area in the diagram below.
![venn_bpa_ch13](https://cloud.githubusercontent.com/assets/1969451/24026108/e076def0-0af9-11e7-9ac6-4d82bee6329e.png)
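
The formula used in the models below follows from Bayes' rule. Write $z_i$ for the true occupancy state of site $i$, $\psi_i$ for its occupancy probability and $p_{it}$ for the detection probability on visit $t$ (this notation is an assumption made here for illustration; the code uses `psi` and `p`). Assuming no false positives, a site with no detections in $T$ visits is either occupied and missed every time, or unoccupied, so

$$
\Pr(z_i = 1 \mid \text{no detections}) = \frac{\psi_i \prod_{t=1}^{T} (1 - p_{it})}{\psi_i \prod_{t=1}^{T} (1 - p_{it}) + (1 - \psi_i)}
$$

A site with at least one detection has $\Pr(z_i = 1 \mid \text{data}) = 1$.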

## Model `site_occ.stan`

The simple model has one value for `psi` and one for `p`, so all sites with no detections have the same value for `psi_con`, called `psi_nd` in the code below:

    generated quantities {
      int<lower=occ_obs, upper=R> occ_fs;
      real psi_nd;            // prob present | not detected
      psi_nd = (psi * (1 - p)^T) / (psi * (1 - p)^T + (1 - psi));
      occ_fs = occ_obs + binomial_rng(R - occ_obs, psi_nd);
    }
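
As a rough numerical check of the formula for `psi_nd`, take illustrative values (made up for this example, not estimates from the BPA data) $\psi = 0.6$, $p = 0.3$ and $T = 3$:

$$
\psi_{nd} = \frac{0.6 \times 0.7^3}{0.6 \times 0.7^3 + 0.4} = \frac{0.2058}{0.6058} \approx 0.34
$$

So a site with no detections is much less likely to be occupied than the unconditional `psi` suggests. With, say, `R = 100` and `occ_obs = 40` (again hypothetical), `occ_fs` is 40 plus a Binomial(60, 0.34) draw: it has mean about 60.4 and always lies between `occ_obs` and `R`, as the declared bounds require.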

## Model `site_occ_cov.stan`

Here each site has a different `psi` and `p`, and a good strategy is to calculate `psi_con` for each, which is in any case a result you may want to monitor.

    generated quantities {
      int occ_fs;       // Number of occupied sites
      real psi_con[R];  // prob occupied | data
      int z[R];         // occupancy indicator, 0/1

      for (i in 1:R) {
        if (sum_y[i] == 0) {  // species not detected
          real      psi = inv_logit(logit_psi[i]);
          vector[T] q = inv_logit(-logit_p[i])';  // q = 1 - p
          real      qT = prod(q);  // Pr(missed on all T visits | occupied)
          psi_con[i] = (psi * qT) / (psi * qT + (1 - psi));
          z[i] = bernoulli_rng(psi_con[i]);
        } else {             // species detected at least once
          psi_con[i] = 1;
          z[i] = 1;
        }
      }
      occ_fs = sum(z);
    }

## Model `bluebug.stan`

The same strategy can be applied here, this time allowing for the variable number of visits to each site. So we replace

    vector[T] q = inv_logit(-logit_p[i])';

with

    vector[last[i]] q = inv_logit(-logit_p[i, 1:last[i]])';
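
Since these models are parameterised on the logit scale, it may help to note that the same conditional probability can be written as a sum of logs. This is only an algebraic rearrangement of the expression for `psi_con[i]`, with $T_i$ standing for the number of visits to site $i$ (`T` in `site_occ_cov.stan`, `last[i]` here) and $\psi_i$, $p_{it}$ for the inverse logits of `logit_psi[i]` and `logit_p[i, t]`:

$$
\operatorname{logit} \Pr(z_i = 1 \mid \text{no detections}) = \operatorname{logit}(\psi_i) + \sum_{t=1}^{T_i} \log(1 - p_{it})
$$

In Stan this could be computed with `inv_logit` and `log1m_inv_logit`, which avoids forming a product of many probabilities and may help numerically when the number of visits is large; for the handful of visits in these examples the direct product is fine.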

## Other models

I haven't looked closely at the dynamic models, but it appears that the reported `z` and `n_occ` values are also based on unconditional `psi` values. Of more concern are the closed-capture models in Ch.6, where the analogous value, the population size `N`, appears to have the same problem. The numerical impact is smaller there, because the denominator of the conditional inclusion probability is always large (i.e., close to 1), but it should still be fixed.
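
For the simplest closed-capture case (constant detection probability, data augmentation), the analogous fix might look like the sketch below. It is not taken from the Ch.6 files and has not been checked against them: the names `M`, `C`, `n_cap`, `omega` and `omega_nd` are hypothetical, the data are reduced to sufficient statistics to keep the program short, and flat priors on `omega` and `p` are assumed.

```stan
data {
  int<lower=1> M;                   // size of the augmented data set (hypothetical name)
  int<lower=1> T;                   // number of capture occasions
  int<lower=0, upper=M> C;          // individuals captured at least once
  int<lower=C, upper=C * T> n_cap;  // total captures over those C individuals
}
parameters {
  real<lower=0, upper=1> omega;     // inclusion probability
  real<lower=0, upper=1> p;         // detection probability
}
model {
  // Captured individuals are in the population; binomial constants are dropped.
  target += C * log(omega) + n_cap * log(p) + (C * T - n_cap) * log1m(p);
  // Never-captured rows: in the population but missed T times, or not in it.
  target += (M - C) * log_sum_exp(log(omega) + T * log1m(p), log1m(omega));
}
generated quantities {
  // Pr(in population | never captured), the analogue of psi_nd above
  real omega_nd = (omega * (1 - p)^T) / (omega * (1 - p)^T + (1 - omega));
  // N can never be below the number of individuals actually captured
  int<lower=C, upper=M> N = C + binomial_rng(M - C, omega_nd);
}
```

The denominator of `omega_nd` is one minus the probability that a row of the augmented data set is captured at all, which is why it tends to stay close to 1 when `M` is much larger than `C`.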

