Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning.
First download anaconda. Then
git clone https://github.com/dburov190/risk-networks.git
cd risk-networks
conda env create -f risknet.yml
conda activate risknet
If that throws an error (something like CondaValueError: prefix already exists
), then type
conda activate risknet
to activate your existing conda environment.
Then write (from the same directory, the root directory of this repository)
cd examples
python3 examples/simulate_simple_epidemic.py
If you're lucky, this might produce something like:
- cycler==0.10.0
- eon==1.1
- kiwisolver==1.2.0
- matplotlib==3.2.1
- networkx==2.4
- numpy==1.18.3
- pyparsing==2.4.7
- pytz==2019.3
- scipy==1.4.1
To add this project's environment to your conda installation and install all dependencies, type
conda env create -f risknet.yml
from the root of this repository.
This overview
-
Performs an simulation of an example epidemic
-
Defines a population, including:
- Age distribution,
- Age-depednent clinical statistics,
- Transmission rate.
-
Generates a time-averaged contact network.
-
Simulates an epidemic with a stochastic kinetic model.
-
Simulates an epidemic with a deterministic 'risk', or 'master equation' model.
-
-
Performs data assimilation over a one-day window for the same population, but a different scenario.
We first import the package's functionality:
# Utilities for generating random populations
from epiforecast.populations import populate_ages, sample_distribution, TransitionRates
from epiforecast.samplers import GammaSampler, AgeAwareBetaSampler
from epiforecast.contact_simulator import ContactSimulator, DiurnalMeanContactRate
# Kinetic model simulation tool
from epiforecast.kinetic_model_simulator import KineticModel
# Simulation tool for ensembles of master equation models
from epiforecast.risk_simulator import MasterEquationModelEnsemble
# Tools for data assimilation and performing observations of 'synthetic data'
# generated by the KineticModel
from epiforecast.data_assimilator import DataAssimilator
from epiforecast.observations import FullObservation
# Tools for simulating specific scenarios
from epiforecast.scenarios import random_infection, randomly_infected_ensemble
from epiforecast.scenarios import percent_infected_at_midnight_on_Tuesday
from epiforecast.scenarios import ensemble_transition_rates_at_midnight_on_Tuesday
from epiforecast.scenarios import ensemble_transmission_rates_at_midnight_on_Tuesday
An epidemic unfolds on a time-evolving contact network, in a population with a distribution of clinical and transmission properties.
First we define the population by the number of individuals:
population = 1000
and the age category of each individual,
age_distribution = [ 0.23, # 0-19 years
0.39, # 20-44 years
0.25, # 45-64 years
0.079 # 65-75 years
]
# 75 onwards
age_distribution.append(1 - sum(age_distribution))
ages = populate_ages(population, distribution=age_distribution)
In the above we define the 6 age categories 0-19, 20-44, 45-64, 65-74, 75->.
Next we define six 'clinical statistics'. Clinical statistics are individual properties that determine their recovery rate and risk of becoming hospitalized or dying, for example. The six clinical statistics are
latent_period
of infection (1/σ
)community_infection_period
over which infection persists in the 'community' (1/γ
),hospital_infection_period
over which infection persists in a hospital setting (1/γ′
),hospitalization_fraction
, the fraction of infected that become hospitalized (h
),community_mortality_fraction
, the mortality rate in the community (d
),hospital_mortality_fraction
, the mortality rate in a hospital setting (d′
).
We randomly generate clinical properties for our example population,
latent_periods = sample_distribution(GammaSampler(k=1.7, theta=2.0), population=population, minimum=2)
community_infection_periods = sample_distribution(GammaSampler(k=1.5, theta=2.0), population=population, minimum=1)
hospital_infection_periods = sample_distribution(GammaSampler(k=1.5, theta=3.0), population=population, minimum=1)
hospitalization_fraction = sample_distribution(AgeAwareBetaSampler(mean=[ 0.02, 0.17, 0.25, 0.35, 0.45], b=4), ages=ages)
community_mortality_fraction = sample_distribution(AgeAwareBetaSampler(mean=[0.001, 0.001, 0.005, 0.02, 0.05], b=4), ages=ages)
hospital_mortality_fraction = sample_distribution(AgeAwareBetaSampler(mean=[0.001, 0.001, 0.01, 0.04, 0.1], b=4), ages=ages)
AgeAwareBetaSampler
is a generic sampler of statistical distribution with a function beta_sampler.draw(age)
which generates clinical properties for each individual based on their age
class (see numpy.random.beta
for more information). minimum
is a random number to which gamma_sampler.draw()
or beta_sampler.draw(age)
is added.
We process the clinical data to determine transition rates between each epidemiological state,
transition_rates = TransitionRates(population,
latent_periods,
community_infection_periods,
hospital_infection_periods,
hospitalization_fraction,
community_mortality_fraction,
hospital_mortality_fraction)
The transition rates have units 1 / day
. There are six transition rates:
- Exposed -> Infected
- Infected -> Hospitalized
- Infected -> Resistant
- Hospitalized -> Resistant
- Infected -> Deceased
- Hospitalized -> Deceased
In general, the transmission rate is different for each pair of individuals, and is
therefore can be as large as population * (population - 1)
.
The transmission rate may depend on properties specific to each individual in each pair,
such as the amount of protective equipment each individual wears.
Here, we pick an arbitrary constant,
constant_transmission_rate = 0.1 # per average number of contacts per day
The transition_rates
and constant_transmission_rate
define the epidemiological
characteristics of the population.
edges = load_edges(os.path.join('..', 'data', 'networks', 'edge_list_SBM_1e3.txt'))
contact_network = nx.Graph()
contact_network.add_edges_from(edges)
population = len(contact_network)
Physical contact between people in realistic communities is rapidly evolving.
We average the contact time between individuals over a static_contacts_interval
,
over which, for the purposes of solving both the kinetic and master equations,
we assume that the graph of contacts is static:
day = 1.0 # We use time units of "day"s
static_contacts_interval = day / 4
On a graph, or 'network', individuals are nodes and contact times are the weighted edges
between them. We create a contact network averaged over static_contacts_interval
for a population of 1000 with diurnally-varying mean contact rate,
mean_contact_rate = DiurnalMeanContactRate(maximum=40, minimum=2)
contact_simulator = ContactSimulator(
n_contacts = nx.number_of_edges(contact_network),
mean_event_duration = 1 / 60 / 24, # 1 minute in units of days
mean_contact_rate = mean_contact_rate, # Diurnally-varying function of time
start_time = -3 / 24, # negative start time allows short 'spinup' of contacts
)
With a population, its clinical properties, a transmission rate, and a contact network,
we can now simulate an epidemic over the static_contacts_interval
.
We seed the epidemic by randomly infecting a small number of individuals,
initial_statuses = random_infection(population, infected=20)
A kinetic simulation is direct simulation of the stochastic evolution of an epidemic.
kinetic_model = KineticModel(contact_network = contact_network,
transition_rates = transition_rates,
community_transmission_rate = transmission_rate,
hospital_transmission_reduction = hospital_transmission_reduction)
# Set the current state of the kinetic model
kinetic_model.set_statuses(initial_statuses)
# Generate contacts at t=0.
contacts = contact_simulator.mean_contact_duration(stop_time=0.0)
# Simulate an epidemic over the static_contacts_interval
kinetic_model.set_mean_contact_duration(contacts)
statuses = kinetic_model.simulate(static_contacts_interval)
The mean field equations represent the average behavior of many stochastic epidemics. Alternatively, we can interpret the mean-field state as the 'probability' that each individual has a certain epidemiological state.
For data assimilation, we simulate an ensemble of master equation models. For this example, we conduct a forward run of a single master equation model.
master_model = MasterEquationModelEnsemble( ensemble_size = 1,
contact_network = contact_network,
transmission_rates = constant_transmission_rate,
state_transition_rates = transition_rates,
# can also define mean_field_closure here
)
# Set the master equation state to the same initial state as the kinetic model.
master_model.set_state(initial_state)
# Simulate an epidemic over the static_contacts_interval.
output = master_model.simulate(static_contacts_interval)
(We can then make a plot that compares the evolution of the epidemic in the two models, and paste it here.)
Here we demonstrate forward filter data assimilation over a single assimilation 'window',
on a specific epidemic scenario. Our experiment begins at midnight (hour 00) on Tuesday,
and proceeds for 1 day until hour 24 (midnight on Wednesday). We assimilate observations
every 6 hours. Recall that static_contacts_interval = day / 4
,
static_contacts_interval = day / 4
data_assimilation_window = 1 * day
intervals_per_window = int(data_assimilation_window / static_contacts_interval)
We begin the experiment by simulating the slow evolution of a contact network over one day:
network_generator = EvolvingContactNetworkGenerator(
population = population,
start_time_of_day = 0,
averaging_interval = static_contacts_interval,
transition_rates = transition_rates,
lambda_min = 5, # contacts per day during activity minimum
lambda_max = 22, # contacts per day during activity maximum
initial_fraction_of_active_edges = 0.034,
measurement_interval = 0.1,
mean_contact_duration = 10 / 60 / 24, # 10 minutes
diurnal_contacts_modulation = diurnal_contacts_modulation,
# **other_contact_network_generation_parameters?
)
static_contacts_times = []
contact_networks = []
# Generate 4 contact networks for hours 00-06, 06-12, 12-18, 18-24
for i in range(intervals_per_window):
contact_network = network_generator.generate_time_averaged_contact_network(start_time_of_day=i*static_contacts_interval)
contact_networks.append(contact_network)
static_contacts_times.append(i * static_contacts_interval)
# Initialize the kinetic model, using the initial contact network at hour 00
kinetic_model = KineticModel( contact_network = contact_networks[0],
transmission_rates = constant_transmission_rate,
state_transition_rates = transition_rates
)
# Instantiate an example epidemic state
state = midnight_on_Tuesday(kinetic_model)
# Set the current state of the kinetic model
kinetic_model.set_state(state)
synthetic_data = []
for i in range(intervals_per_window):
# Set the contact network for the kinetic model
kinetic_model.set_contact_network(contact_networks[i])
# Run the kinetic model forward for 6 hours
kinetic_model.simulate(static_contacts_interval)
# Record the output
synthetic_data.append(kinetic_model.state())
We now have 'data' from hours 06, 12, 18, and 24.
master_model = MasterEquationModelEnsemble( contact_network = contact_networks[0],
ensemble_size = 20,
transmission_rates = constant_transmission_rate,
state_transition_rates = transition_rates,
)
# Generate a joint distribution of states and transition rates for this example.
ensemble_states = randomly_infected_ensemble(percent_infected_at_midnight_on_Tuesday())
ensemble_transition_rates = ensemble_transition_rates_at_midnight_on_Tuesday()
ensemble_transmission_rates = ensemble_transmission_rates_at_midnight_on_Tuesday()
master_model.update_ensemble_states(ensemble_states)
master_model.update_ensemble_transition_rates(ensemble_transition_rates)
master_model.update_ensemble_transmission_rates(ensemble_transmission_rates)
# Initialize the data assimilation method
assimilator = DataAssimilator(observations = FullStateObservation(population),
errors = [],
transition_rates_to_update_str = ['latent_periods'],
transmission_rate_to_update_flag = False)
for i in range(intervals_per_window):
# Set the contact network for the ensemble of master equation models
master_model.set_contact_network(contact_networks[i])
# Run the master model ensemble forward for six hours
master_model.simulate(static_contacts_interval)
(new_transition_rates,
new_transmission_rates,
new_ensemble) = assimilator.update(master_model.ensemble,
synthetic_data,
master_model.transition_rates,
master_model.transmission_rates,
contact_network = network_generator.get_contact_networks())
# Update the master model ensemble and parameters
master_model.set_ensemble(new_ensemble)
master_model.set_transition_rates(new_transition_rates)
master_model.set_transmission_rates(new_transmission_rates)
This completes data assimilation over one 'assimilaton window'. To apply interventions, we adjust
the inputs to network_generator.generate_time_averaged_contact_network
, generate a new time series of
contact networks, and then simulate the evolution of the epidemic over a subsequent assimilation window.
(Now we should compare the ensemble trajectory and the trajectory of the synthetic data...)