Implement a regional SIR model with approximate inference #2466
Conversation
import torch

import pyro.poutine as poutine


@torch.no_grad()
def align_samples(samples, model, particle_dim):
    """
    Unsqueeze stacked samples such that their particle dims all align.
    This traces ``model`` to determine the ``event_dim`` of each site.
    """
    assert particle_dim < 0

    sample = {name: value[0] for name, value in samples.items()}
    with poutine.block(), poutine.trace() as tr, poutine.condition(data=sample):
        model()

    samples = samples.copy()
    for name, value in samples.items():
        event_dim = tr.trace.nodes[name]["fn"].event_dim
        pad = event_dim - particle_dim - value.dim()
        if pad < 0:
            raise ValueError("Cannot align samples, try moving particle_dim left")
        if pad > 0:
            shape = value.shape[:1] + (1,) * pad + value.shape[1:]
            print("DEBUG reshaping {} : {} -> {}".format(name, value.shape, shape))
            samples[name] = value.reshape(shape)

    return samples
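For concreteness, a small illustrative example of the shape arithmetic this helper performs (hedged: the site names, shapes, and particle_dim value are made up, and the loop below only replicates the pad computation rather than calling the helper):

# Illustrative shape arithmetic for two hypothetical sites with 100 stacked posterior samples.
particle_dim = -2
for name, (shape, event_dim) in {"R0": ((100,), 0), "beta": ((100, 7), 1)}.items():
    pad = event_dim - particle_dim - len(shape)
    aligned = shape[:1] + (1,) * pad + shape[1:]
    print(name, shape, "->", aligned)
# R0   (100,)   -> (100, 1)     sample dim lands at plate dim -2
# beta (100, 7) -> (100, 1, 7)  same plate dim, just left of the single event dim

In other words, singleton dims are inserted after the leading sample dim so that, for every site, the sample dim sits at particle_dim to the left of the event dims, matching what a particle plate at that dim would produce.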
@neerajprad I believe we can later refactor to make this generic, say as a new kwarg to MCMC.get_samples(), maybe align_samples (defaults to False) or particle_dim (defaults to None, must be negative).
That sounds reasonable. IIUC, we can put an outermost particle dim with size=1 for a model (if align_samples=True), collect samples with our usual flow, and concatenate the collected samples at dim 0, which should be the same as sampling from a vectorized model with an outermost plate dim. Would that work?
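A rough sketch of the concatenation step being described, assuming each run's samples come back as a dict of tensors (the helper name is made up, not part of this PR):

import torch

def concat_runs(runs):
    """Concatenate per-run sample dicts along dim 0 so the result looks like
    draws from a model vectorized by an outermost particle plate.
    (Illustrative helper, not part of this PR.)"""
    return {name: torch.cat([run[name] for run in runs], dim=0) for name in runs[0]}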
I think it's safer to first collect samples and event_dims and afterwards reshape. That way we wouldn't force users to write vectorizable / broadcastable code.
Sorry, I missed your comment. That sounds reasonable, but would sample reshaping be useful if the model wasn't vectorizable? Just want to make sure that I understand the use case.
Yes, sample reshaping would still be useful even if the model is not vectorizable. For example, in the CompartmentalModel in this PR, there are three models that are mathematically equivalent but have different computational complexity. We run HMC on one of those models that is not vectorizable over particles (it is instead vectorized over time, hence its name _vectorized_model()). We then stitch together multiple samples and poutine.condition two other models that are vectorized over particles but sequential over time (_sequential_model() and _generative_model()).
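To make the stitching concrete, a hedged sketch of that flow (mcmc and model are placeholders for an MCMC object and a CompartmentalModel instance; exact call signatures and the plate name are illustrative, not the PR's actual code):

import pyro
import pyro.poutine as poutine

# 1. HMC runs on the time-vectorized, particle-sequential model (elsewhere),
#    so each site in `samples` has the number of samples as its leading dim.
samples = mcmc.get_samples()

# 2. Insert singleton dims so every site lines up with a particle plate at dim -2.
samples = align_samples(samples, model._vectorized_model, particle_dim=-2)

# 3. Replay a particle-vectorized, time-sequential model conditioned on them.
num_samples = len(next(iter(samples.values())))
with pyro.plate("particles", num_samples, dim=-2):
    with poutine.condition(data=samples):
        model._generative_model()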
Thanks for explaining, this makes sense. I'll go over the models and your reshaping utility.
# Account for infections from all regions.
I_coupled = state["I"] @ self.coupling
pop_coupled = self.population @ self.coupling
shouldn't the coupling only operate on I?
I don't know, we should probably ask @lucymli what parameterization makes most sense. As this PR currently stands, coupling need not be normalized, and I have aimed for the following properties to hold (but I may be wrong 😖):

- coupling = torch.ones(R, R) replicates the behavior of a single region of size population.sum().
- If there is a single infectious individual among all regions, then the expected number of subsequent infections depends on R0 but not on coupling. This implies that the more time I spend infecting Oakland, the less time I spend infecting San Francisco.

I guess an alternative parameterization is a pairwise R0 matrix, but this seems less plausible to me. What other parameterizations or properties seem sensible?
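A small numerical check of the first property (illustrative values; the names mirror the snippet above):

import torch

R = 3
I = torch.tensor([5., 0., 2.])               # stands in for state["I"]
population = torch.tensor([1e3, 2e3, 3e3])
coupling = torch.ones(R, R)

I_coupled = I @ coupling                      # tensor([7., 7., 7.]) == I.sum()
pop_coupled = population @ coupling           # tensor([6000., 6000., 6000.]) == population.sum()
# Every region sees the pooled infections and pooled population, i.e. the
# regions behave like a single region of size population.sum().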
does align_samples need tests?
Addresses #2426
This implements a discrete state SIR model with interaction among multiple regions. I'm pretty happy with the user-facing model code, but the inference code suffers from inscrutable dimension complexity.
Approximation
Because exact enumeration has exponential cost in the number of regions, this PR instead uses a point estimate for cross-region infections, namely the (-0.5, population+0.5)-bounded auxiliary variable suggested by @martinjankowiak. I believe this estimate is unbiased except at the edges, where there are small edge effects. I'd like to test accuracy after this merges. This PR introduces an interface to access the approximation, prev["I_approx"], so we can later change the approximation under the hood without breaking user-facing model code.
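As a rough illustration of the kind of auxiliary variable being described (a sketch only; the site name, duration, and shapes are made up, and the PR's actual code may differ):

import torch
import pyro
import pyro.distributions as dist

population = torch.tensor([1000., 2000., 3000.])   # per-region populations (illustrative)
duration = 30                                       # number of time steps (illustrative)

# Continuous relaxation of the latent integer counts, bounded per region in
# (-0.5, population + 0.5); rounding recovers approximate integer states.
auxiliary = pyro.sample(
    "auxiliary",
    dist.Uniform(-0.5, population + 0.5).expand([duration, 3]).to_event(2),
)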
Tensor dimension bookkeeping
This PR required reordering dimensions to ensure proper broadcasting in user-facing model code. The new ordering is:
Previously enum dims were on the left for sequential enumeration but on the right for vectorized enumeration. Also included in this change is the new t=slice(None) value for time in the vectorized model (previously t was a tuple with Ellipsis and Nones).
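For readers less familiar with the trick, slice(None) is just the programmatic form of a full slice, so the same indexing expression in user model code can serve both execution modes (a minimal illustration, with made-up data):

import torch

data = torch.arange(12).reshape(4, 3)    # (time, regions), illustrative
t = 2                                    # sequential model: one time step
step = data[t]                           # shape (3,)
t = slice(None)                          # vectorized model: all time steps
series = data[t]                         # shape (4, 3), same as data[:]
assert (series == data).all()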
Tested