diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/.nojekyll @@ -0,0 +1 @@ + diff --git a/404.html b/404.html new file mode 100644 index 00000000..69eed320 --- /dev/null +++ b/404.html @@ -0,0 +1,137 @@ + + +
+ + + + +YEAR: 2019 +COPYRIGHT HOLDER: Imperial College of Science, Technology and Medicine ++ +
Copyright (c) 2019 Imperial College of Science, Technology and Medicine
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+This package defines a set of useful primitives for structuring and +simulating individual based mechanistic models, with special emphasis on +infectious disease modelling. Structuring means that the +package lets you specify the state space of the model and transition +rules for how state changes over time, and simulating means +that the package defines a principled way to put state and transitions +together to update state over a time step. Transitions are any functions +that change state, and in “individual” come in two forms: processes, which occur each time step and events, which are scheduled. The simulation updates +over a discrete time step of unit length.
+This vignette provides a description of the various primitive
+elements provided by the “individual” package for modelling and explains
+how they work together. Please see the vignette("Tutorial")
+for a practical example.
In “individual”, variables are how one defines state, and represent +any attribute of an individual. While many variables will be dynamically +updated throughout a simulation, they don’t have to be. Specifying a +baseline characteristic for each individual that doesn’t change over a +simulation will still be specified using a variable object.
+There are 3 types of variable objects:
+individual::CategoricalVariable
can represent a set of
+mutually exclusive categories, individual::IntegerVariable
+for representing integers, and individual::DoubleVariable
+for continuous values (real numbers).
Most epidemiological models will require use of categorical
+variables to define state, for example, the Susceptible, Infectious, and
+Recovered classes in the classic SIR
+model. In general, for compartmental models, each set of
+compartments which describes an aspect of an individual should be mapped
+to a single individual::CategoricalVariable
. For example, a
+model which classifies individuals as in an SIR state and in a
+high or low risk groups could be represented with two categorical
+variable objects
There can be an unlimited number of categorical variable objects, but
+every individual can only take on a single value for each of them at any
+time. The allowable (categorical) state space for each individual is a
+contingency table where the margins are given by the values for each
+individual::CategoricalVariable
.
individual::CategoricalVariable
objects internally store
+state as a hash table of category values (with string keys), each of
+which contains a individual::Bitset
recording the
+individuals in that state, making operations on
+individual::CategoricalVariable
objects, or chains of
+operations extremely fast, and therefore should be the preferred
+variable type for discrete variables. However, cases in which discrete
+variables have a large number of values or when many values have few or
+no individuals in that category should use the
+individual::IntegerVariable
instead.
An individual::IntegerVariable
should be used for
+discrete variables that may either be technically unbounded, or whose
+set of possible values is so large that a
+individual::CategoricalVariable
is impractical.
+individual::IntegerVariable
objects internally stores state
+as a vector of integers of length equal to the number of individuals in
+the simulation, to be contrasted with the hash table and bitsets used in
+individual::CategoricalVariable
. Examples of this variable
+type might include household for a model that simulated transmission of
+disease between a community of households, or age group for an
+age-structured model.
A individual::DoubleVariable
can be used for continuous
+variables that are represented as double precision floating-point
+numbers (hence, “double”). Such variables are internally stored as a
+vector of doubles of length equal to the number of individuals in the
+simulation. Examples of this variable type might include levels of
+immune response or concentration of some environmental pollutant in the
+blood.
In “individual”, processes are functions which are called on each
+time step with a single argument, t
, the current time step
+of the simulation. Processes are how most of the dynamics of a model are
+specified, and are implemented as closures,
+which are functions that are able to access data not included in the
+list of function arguments.
An illustrative example can be found in the provided
+individual::bernoulli_process
, which moves individuals from
+one value (source) of a individual::CategoricalVariable
to
+another (destination) at a constant probability (such that the dwell
+time in the from
state follows a geometric
+distribution).
+bernoulli_process <- function(variable, from, to, rate) {
+ function(t) {
+ variable$queue_update(
+ to,
+ variable$get_index_of(from)$sample(rate)
+ )
+ }
+}
We can test empirically that dwell times do indeed follow a geometric
+distribution (and demonstrate uses of
+individual::IntegerVariable
objects) by modifying slightly
+the function to take int_variable
, an object of class
+individual::IntegerVariable
. Each time the returned
+function closure is called it randomly samples a subset of individuals
+to move to the destination state, and records the time step at which
+those individuals move in the individual::IntegerVariable
.
+We run the simple simulation loop until there are no more individuals
+left in the source state, and compare the dwell times in the source
+state to the results of stats::rgeom
, and see that they are
+from the same distribution, using stats::chisq.test
for
+comparison of discrete distributions.
+bernoulli_process_time <- function(variable, int_variable, from, to, rate) {
+ function(t) {
+ to_move <- variable$get_index_of(from)$sample(rate)
+ int_variable$queue_update(values = t, index = to_move)
+ variable$queue_update(to, to_move)
+ }
+}
+
+n <- 5e4
+state <- CategoricalVariable$new(c('S', 'I'), rep('S', n))
+time <- IntegerVariable$new(initial_values = rep(0, n))
+
+proc <- bernoulli_process_time(variable = state,int_variable = time,from = 'S',to = 'I',rate = 0.1)
+
+t <- 0
+while (state$get_size_of('S') > 0) {
+ proc(t = t)
+ state$.update()
+ time$.update()
+ t <- t + 1
+}
+
+times <- time$get_values()
+chisq.test(times, rgeom(n,prob = 0.1),simulate.p.value = T)
+#>
+#> Pearson's Chi-squared test with simulated p-value (based on 2000
+#> replicates)
+#>
+#> data: times and rgeom(n, prob = 0.1)
+#> X-squared = 6033.2, df = NA, p-value = 0.7346
By using individual::IntegerVariable
and
+individual::DoubleVariable
objects to modify the
+probability with which individuals move between states, as well as
+counters for recording previous state transitions and times, complex
+dynamics can be specified from simple building blocks. Several “prefab”
+functions are provided to make function closures corresponding to state
+transition processes common in epidemiological models.
During each time step we often want to record (render) certain output +from the simulation. Because rendering output occurs each time step, +they are called as other processes during the simulation loop (this is +another way we could have implemented the time counters in the previous +example).
+In “individual”, rendering output is handled by
+individual::Render
class objects, which build up the
+recorded data and can return a base::data.frame
object for
+plotting and analysis. A individual::Render
object is then
+stored in a function closure that is called each time step like other
+processes, and into which data may be recorded. Please note that the
+function closure must also store any variable objects whose values
+should be tracked, such as the prefab rendering process
+individual::categorical_count_renderer_process
, taking a
+individual::Render
object as the first argument and a
+individual::CategoricalVariable
object as the second.
+categorical_count_renderer_process <- function(renderer, variable, categories) {
+ function(t) {
+ for (c in categories) {
+ renderer$render(paste0(c, '_count'), variable$get_size_of(c), t)
+ }
+ }
+}
While technically nothing more than variables and processes is
+required to simulate complex models in “individual”, keeping track of
+auxiliary state, such as counters which track the amount of time until
+something happens (e.g., a non-geometric waiting time distribution) can
+quickly add unnecessary frustration and complexity. The
+individual::Event
class is a way to model events that don’t
+occur every time step, but are instead scheduled to fire at some future
+time after a delay. The scheduled events can be canceled, if something
+occurs first that should preempt them.
Events can be scheduled by processes; when an event fires, the event
+listener is called. The event listener is an arbitrary function that is
+called with the current time step t
as its sole argument.
+Listeners can execute any state changes, and can themselves schedule
+future events (not necessarily of the same type). Listeners are
+implemented as function closures, just like processes. Events can have any number of listeners
+attached to them which are called when the event fires.
Unlike individual::Event
objects, whose listeners are
+called with a single argument t
, when the listeners of
+individual::TargetedEvent
objects are called, two arguments
+are passed. The first is the current time step t
, and the
+second is a individual::Bitset
object giving the indies of
+individuals which will be affected by the event. Because it stores these
+objects internally to schedule future updates, initialization of
+individual::TargetedEvent
objects requires the total
+population size as an argument. When scheduling a
+individual::TargetedEvent
, a user provides a vector or
+individual::Bitset
of indicies for those individuals which
+will have the event scheduled. In addition, a user provides a delay
+which the event will occur after. The delay can either be a single
+value, so all individuals will experience the event after that delay, or
+a vector of values equal to the number of individuals who were
+scheduled. This allows, for example individual heterogeneity in
+distribution of waiting times for some event.
Much like regular Events, previously scheduled
+targeted events can be canceled, requiring either a
+individual::Bitset
or vector of indices for individuals
+whose events should be canceled.
The code to define events is typically quite brief, although
+typically another process must be defined that queues them. A recovery
+event in an SIR model for example, might look like this, where
+state
is a
+individual::CategoricalVariable
:
+recovery_event <- TargetedEvent$new(population_size = 100)
+recovery_event$add_listener(function(timestep, target) {
+ state$queue_update('R', target)
+})
Using a combination of variables, processes, and events, highly
+complex individual based models can be specified, and simulated with the
+function individual::simulation_loop
, which defines the
+order in which state updates are preformed.
It is possible to create conflicts between different processes, for +example, if a single individual is subject to a death process and +infection process, then both death and infection state changes could be +scheduled for that individual during a time step. When processes or +events create conflicting state updates, the last update will +take precedence. Another way to make sure conflicts do not exist is to +ensure only a single process is called for each state which samples each +individual’s update among all possible updates which could happen for +those individuals, so that conflicting updates cannot be scheduled for +individuals in a state.
++Simulation loop flowchart +
+The equivalent code used in the simulation loop is below.
+
+simulation_loop <- function(
+ variables = list(),
+ events = list(),
+ processes = list(),
+ timesteps
+ ) {
+ if (timesteps <= 0) {
+ stop('End timestep must be > 0')
+ }
+ for (t in seq_len(timesteps)) {
+ for (process in processes) {
+ execute_any_process(process, t)
+ }
+ for (event in events) {
+ event$.process()
+ }
+ for (variable in variables) {
+ variable$.update()
+ }
+ for (event in events) {
+ event$.tick()
+ }
+ }
+}
vignettes/Changing_Populations.Rmd
+ Changing_Populations.Rmd
This tutorial will present a toy model for simulating infectious +disease in populations which change in size. We will build on the SIR +model we created in the first tutorial to +show how to model a growing and shrinking population.
+As of individual v0.1.8, all variables and targeted events can be +extended and shrunk. Extending a variable requires the user to specify +new values to add to the end of the variable. This can be used to model +births in a population. Shrinking a variable requires the user to +specify which indices to remove, the indices of subsequent values will +be shifted downwards to replace the removed element. This can be used to +model deaths or displacement from a population.
+Targeted events can similarly be resized to match a changing
+population. You can simply add new individuals with
+$queue_extend
. If you would like to schedule the event for
+new individuals, you can use $queue_extend_with_schedule
+with a vector of delays to make sure the event is triggered for new
+individuals.
Resizing updates, shrinking and extending, are queued until after all +processes and event listeners are executed and are applied +after other updates. All the shrink updates for a variable or event are +combined and are applied before extend updates. This ensures that the +indices used to shrink a variable are the correct size for the variable. +After which, extension updates are applied in the order they were +called.
+From a performance perspective, shrinking updates should be used +sparingly, since the re-indexing of a variable can be more +computationally expensive than other updates. Consider applying +shrinking updates over longer intervals instead of every time step if +performance becomes an issue for your model.
+Let’s try to model a system with a constant birth rate and the +probability of death each time step is uniform for each individual.
+
+birth_rate <- 2
+death_rate <- .001
+
+birth_process <- function(t) {
+ n_births <- rpois(1, birth_rate / dt)
+ health$queue_extend(rep('S', n_births))
+ recovery_event$queue_extend(n_births)
+}
+
+death_process <- function(t) {
+ pop_size <- health$size()
+ deaths <- sample.int(pop_size, rbinom(1, pop_size, min(death_rate / dt, 1)))
+ health$queue_shrink(deaths)
+ recovery_event$queue_shrink(deaths)
+}
We can now add these processes to our previous model and plot the new +results.
+
+simulation_loop(
+ variables = list(health),
+ events = list(recovery_event),
+ processes = list(
+ infection_process,
+ recovery_process,
+ birth_process, # new process to introduce S into the population
+ death_process, # new process to randomly remove individuals from the populations
+ health_render_process
+ ),
+ timesteps = steps
+)
When modeling the impact of an intervention on a disease, it is +common to have a first simulation phase where the intervention is +disabled to achieve steady-state, followed by a second phase during +which the intervention is applied. Often, we want to run the second +phase many times over, varying the intervention parameters. Simulating +the first phase every time is unnecessary and wasteful, since it isn’t +affected by the intervention parameters.
+Individual allows the user to run a simulation for a number of time +steps, save the state of the simulation and resume it multiple times, +with different parameters each time. This way, the initial phase before +the intervention only needs to be simulated once.
+The typical way to use this feature is to define a simulation
+function which creates all the relevant simulation data and then calls
+simulation_loop
. The function we define takes in an
+optional state
parameter that is passed through to
+simulation_loop
.
+run_simulation <- function(timesteps, state=NULL) {
+ health <- CategoricalVariable$new(c("S", "I"), rep("S", 10))
+ process <- bernoulli_process(health, "S", "I", 0.01)
+ simulation_loop(
+ variables=list(health),
+ processes=list(process),
+ timesteps=timesteps,
+ state=state)
+}
The simulation can be run a first time, for a given number of steps. +It returns a state object, which captures the internal state of +all variables and events, the state of the random number generator and +the number of time steps that were simulated.
+
+state <- run_simulation(timesteps = 50)
Finally, the simulation is resumed with a larger number of time
+steps, passing in the state object as an argument. The
+timesteps
argument refers to the total number of time
+steps, including both the original simulation run and the new one. In
+this case, run_simulation
will only simulate 50 extra
+steps. Before running the actual simulation,
+simulation_loop
will reload the simulation state from its
+argument, overwriting any values we had set when initializing the
+variables.
+run_simulation(timesteps = 100, state=state)
To demonstrate the checkpoint and restore functionality of individual +in a practical setting, we will use a SIRS model with a vaccination +intervention. Our aim is to compare the impact of the vaccination +campaign, given different vaccine efficacy scenarios.
+Individuals in the simulation move from being susceptible (S) to
+infectious (I) to recovered (R) and back to susceptible, after their
+natural immunity wanes off. Out of the entire population N
,
+only I0
individuals are initially infectios, and the rest
+are susceptible. Orthogonally, an individual can either be vaccinated
+(Y) or not (N). The vaccination and the immunity it confers never wanes
+off. All individuals are initially unvaccinated.
+make_variables <- function(N, I0) {
+ health_states_t0 <- rep("S",N)
+ health_states_t0[sample.int(n = N,size = I0)] <- "I"
+ health <- CategoricalVariable$new(categories = c("S","I","R"), initial_values = health_states_t0)
+
+ vaccinated <- CategoricalVariable$new(categories = c("Y", "N"), initial_values = rep("N", N))
+
+ list(health=health, vaccinated=vaccinated)
+}
A vaccinated individual has a reduced probability of becoming +infectious, as determined by the vaccine’s efficacy. The function below +creates the process to model infection. It samples from the susceptible +compartments, applying the different rates depending on the whether an +individual’s vaccinated status.
+
+make_infection_process <- function(health, vaccinated, N, beta, vaccine_efficacy) {
+ function(t) {
+ I <- health$get_size_of("I")
+ foi <- beta * I / N
+
+ vaccinated_S <- health$get_index_of("S")$and(vaccinated$get_index_of("Y"))
+ non_vaccinated_S <- health$get_index_of("S")$and(vaccinated$get_index_of("N"))
+
+ vaccinated_S$sample(rate = foi * (1 - vaccine_efficacy))
+ non_vaccinated_S$sample(rate = foi)
+
+ health$queue_update(value = "I", index = vaccinated_S)
+ health$queue_update(value = "I", index = non_vaccinated_S)
+ }
+}
For a while at the start of the simulation no vaccination takes
+place. Only after a number of time steps, determined by the
+vaccination_start
parameter, does the intervention begin.
+Periodically, an event will fire and every individual has a fixed
+probability of becoming vaccinated. If vaccination_start
is
+NULL
, the intervention never begins.
+make_vaccination_event <- function(vaccinated, vaccination_start, vaccination_interval, vaccination_rate) {
+ e <- Event$new()
+ e$add_listener(function(t) {
+ vaccinated$queue_update(value = "Y",
+ vaccinated$get_index_of("N")$sample(vaccination_rate))
+ e$schedule(vaccination_interval)
+ })
+ if (!is.null(vaccination_start)) {
+ e$schedule(vaccination_start - 1)
+ }
+ e
+}
We will define our simulation as a function, taking the simulation
+parameters as arguments. Any additional arguments to the function, as
+denoted by ...
, will be passed on to
+simulation_loop
. This will allow us to pass the
+state
argument in. The function returns the simulation data
+as well as the new saved state.
+run_simulation <- function(
+ steps,
+ N = 1e3,
+ I0 = 5,
+ beta = 0.1, # S -> I
+ gamma = 0.05, # I -> R
+ xi = 0.03, # R -> S
+ vaccination_start = NULL,
+ vaccination_interval = 10,
+ vaccination_rate = 0.05, # N -> Y
+ vaccine_efficacy = 1,
+ ...)
+{
+ variables <- make_variables(N, I0)
+ infection_process <- make_infection_process(
+ variables$health,
+ variables$vaccinated,
+ N, beta, vaccine_efficacy)
+ recovery_process <- bernoulli_process(variables$health, "I", "R", gamma)
+ return_process <- bernoulli_process(variables$health, "R", "S", xi)
+ vaccination_event <- make_vaccination_event(
+ variables$vaccinated, vaccination_start, vaccination_interval, vaccination_rate)
+
+ renderer <- Render$new(timesteps = steps)
+ health_render_process <- categorical_count_renderer_process(
+ renderer = renderer,
+ variable = variables$health,
+ categories = variables$health$get_categories()
+ )
+
+ processes <- list(
+ infection_process,
+ recovery_process,
+ return_process,
+ health_render_process)
+
+ final_state <- simulation_loop(
+ variables = variables,
+ events = list(vaccination_event),
+ processes = processes,
+ timesteps = steps,
+ ...)
+
+ list(result=renderer$to_dataframe(), state=final_state)
+}
We will start by running and plotting our baseline simulation, with +the intervention disabled.
+
+data <- run_simulation(steps=1500)$result
+colours <- c("royalblue3","firebrick3","darkorchid3")
+matplot(
+ x=data["timestep"],
+ y=data[c("S_count","I_count", "R_count")],
+ xlab="Time", ylab="Count",
+ type="l", lwd=2, lty = 1, col = colours
+)
+legend(
+ x = "topright",
+ pch = rep(16,3),
+ col = colours,
+ legend = c("S", "I", "R"), cex = 1.5,
+ bg='white'
+)
We see that the simulation takes some time to settle from its initial
+parameters to its steady-state conditions. We will now enable the
+vaccine intervention, but only starting at a point after the simulation
+has settled, for example at t=500
.
+data <- run_simulation(steps=1500, vaccination_start = 500, vaccine_efficacy = 1)$result
+colours <- c("royalblue3","firebrick3","darkorchid3")
+matplot(
+ x=data["timestep"],
+ y=data[c("S_count","I_count", "R_count")],
+ xlab="Time", ylab="Count",
+ type="l", lwd=2, lty = 1, col = colours
+)
+legend(
+ x = "topright",
+ pch = rep(16,3),
+ col = colours,
+ legend = c("S", "I", "R"), cex = 1.5,
+ bg='white'
+)
The simulation above clearly shows the effect of the vaccination
+campaign, starting at t=500
. However, it made the
+optimistic assumption of a 100% vaccine efficacy. We wish to run the
+simulation again but with varying levels of efficacy, in order the
+compare its impact.
While we could run the code above many times over, each simulation
+would repeat the first 499 timesteps, despite the result being identical
+each time. Instead we start by running only these timesteps, and saving
+the result. We do need to specify the start of the intervention, as it
+is necessary to schedule the first vaccination event. However the
+details of the intervention (ie. vaccine_efficacy
) are
+irrelevant and can be omitted.
+initial <- run_simulation(steps=499, vaccination_start = 500)
From this initial result, we can resume the simulation, but using +different values of vaccine efficacy each time. We also include a +control simulation, in which no vaccination takes place. Each of these +simulation will skip the first 499 steps and only run the next 1001 time +steps.
+
+control <- run_simulation(steps=1500, vaccination_start = 500, vaccine_efficacy=0.0, state=initial$state)
+vaccine30 <- run_simulation(steps=1500, vaccination_start = 500, vaccine_efficacy=0.3, state=initial$state)
+vaccine50 <- run_simulation(steps=1500, vaccination_start = 500, vaccine_efficacy=0.5, state=initial$state)
+vaccine100 <- run_simulation(steps=1500, vaccination_start = 500, vaccine_efficacy=1.0, state=initial$state)
Finally we aggregate and plot the results from all these simulations. +We also need to include the data from our initial run, which we will +plot the same colour as our control simulation.
+
+colours <- c("royalblue3","firebrick3","darkorchid3", "seagreen3")
+
+# Pad initial out to ensure it has the same shape as other series.
+initial$result[500:1500,] <- NA
+matplot(
+ data.frame(
+ initial$result[,"I_count"],
+ vaccine30$result[,"I_count"],
+ vaccine50$result[,"I_count"],
+ vaccine100$result[,"I_count"],
+ control$result[,"I_count"]),
+ xlab = "Time", ylab = "Infected count",
+ type = "l", lwd = 1.5, lty = 1, col = colours,
+)
+legend(
+ x = "topright", pch = rep(16,3),
+ col = colours,
+ legend = c("Control", "30%", "50%", "100%"), cex = 1.5,
+ bg='white'
+)
Saving and restoring the simulation state comes with a number of +caveats.
+CategoricalVariable
cannot be
+modified, etc. The order of variables and events passed to the
+run_simulation
function must remain stable.vaccination_start
; changing that parameter would have no
+effect.While parameters of the simulation can be changed between the initial
+run and the subsequent runs (as demonstrated with the
+vaccine_efficacy
parameter above), in general you should
+not modify parameters that would have been already had an impact on the
+first part of the simulation. Doing so would produce results that can
+only be produced through checkpoint and resume, and not as a single
+simulation.
For example, in our SIRS model, it may be tempting to model a +time-varying parameter by running half of the simulation with one value +and then resuming it with a different value. While this would probably +work, it would be brittle and hard to compose. As more time-varying +parameters are introduced to the model, the simulation would need to be +saved and restored each time a value changes.
+By default resuming a simulation does not restore R’s random number +generator’s state. Every resumed run from the same saved state will be +independent and, if the model is stochastic, will produce different +results.
+We can demonstrate that by running the baseline of our SIRS model +multiple times and plotting the results. All three runs start off from +the same state, inherited from our original model run, but quickly +diverge based on the outcome of random draws.
+
+initial <- run_simulation(steps=499)
+run1 <- run_simulation(steps = 1500, state = initial$state)
+run2 <- run_simulation(steps = 1500, state = initial$state)
+run3 <- run_simulation(steps = 1500, state = initial$state)
+
+initial$result[500:1500,] <- NA
+matplot(
+ data.frame(
+ initial$result[,"I_count"],
+ run1$result[,"I_count"],
+ run2$result[,"I_count"],
+ run3$result[,"I_count"]),
+ xlab = "Time", ylab = "Infected count",
+ type = "l", lwd = 1.5, lty = 1, col = colours
+)
Sometimes this behaviour may not be desirable, and we would instead +like to restore the state of the random number generator exactly where +it was when we stopped the first part of the run. One example of this is +when checking that our model behaves the same whether or not it was +saved and resumed.
+The code below show an attempt at running the model twice, once as a +continous run and once in a piecewise manner. We would hope that seeding +the random generator at the start of the simulation would be enough to +get identical results out of it. Unfortunately we don’t, because the +random number generator state’s at the intermediate point isn’t being +preserved.
+
+set.seed(123)
+uninterrupted_run <- run_simulation(steps = 1500)$result
+
+set.seed(123)
+piecewise_run_initial <- run_simulation(steps = 499)
+piecewise_run_final <- run_simulation(steps = 1500, state = piecewise_run_initial$state)
+piecewise_run <- rbind(piecewise_run_initial$result, piecewise_run_final$result[500:1500,])
+
+all.equal(uninterrupted_run, piecewise_run)
+#> [1] "Component \"S_count\": Mean relative difference: 0.1454073"
+#> [2] "Component \"I_count\": Mean relative difference: 0.6820171"
+#> [3] "Component \"R_count\": Mean relative difference: 0.625746"
We can try the same again, but this time set
+restore_random_state = TRUE
to enable restoring the
+simulation state. This time we’ve successfully managed to reproduce the
+data from our uninterrupted run.
+set.seed(123)
+piecewise_run_initial <- run_simulation(steps = 499)
+piecewise_run_final <- run_simulation(steps = 1500, state = piecewise_run_initial$state, restore_random_state = TRUE)
+piecewise_run <- rbind(piecewise_run_initial$result, piecewise_run_final$result[500:1500,])
+
+all.equal(uninterrupted_run, piecewise_run)
+#> [1] "Component \"S_count\": Mean relative difference: 0.05500226"
+#> [2] "Component \"I_count\": Mean relative difference: 0.2852969"
+#> [3] "Component \"R_count\": Mean relative difference: 0.274895"
Using restore_random_state = TRUE
resets the global
+random number generator’s state, which could have surprising and
+undesirable side effects. It is generally useful in tests, but should be
+used carefully elsewhere.
+set.seed(123)
+print(.GlobalEnv$.Random.seed)
+#> [1] 10403 624 -983674937 643431772 1162448557 -959247990
+#> [7] -133913213 2107846888 370274761 -2022780170 -412390145 848182068
+#> [13] -266662747 -1309507294 1356997179 1661823040 1749531457 -516669426
+#> [19] 1042678071 -1279933428 -410084963 1151007674 -895613453 1288379032
+#> [25] -376044615 -1358274522 307686511 101447652 1796216213 -1567696558
+#> [31] 1186934955 -1925339152 -472470735 80319294 -1524429145 326645436
+#> [37] -389586803 -400786966 -890731933 -852332472 1365217705 -1785317034
+#> [43] -1551153185 1359863956 2098748037 -1013039742 -329721061 -1587358816
+#> [49] 344102689 -1520389522 166492183 1821136236 1646453629 1056605210
+#> [55] -1419044141 -806080008 520985497 711286406 2004844367 -1445006012
+#> [61] 1329781621 -1188844110 -1089068661 1173875536 -1983217903 514629022
+#> [67] -237421177 -258138084 -930078099 261626442 1349308227 -1125425240
+#> [73] -1677778551 25874358 409637567 -1987430924 1583257701 -136173086
+#> [79] 639501307 272101120 -1024630015 -1994369842 -939499785 -1944742196
+#> [85] -591520419 -1994900358 1072996275 1119025496 2035491705 -2082894618
+#> [91] 776176175 -69557596 1794806101 -178474478 -497581461 874372784
+#> [97] 518669041 -370223106 1295572071 -1776240260 -1692674995 1935534762
+#> [103] 298421283 111542024 -1075273367 518297110 -289321569 1331379028
+#> [109] 1768277573 1473660482 2120850651 879016544 -864018719 1661675310
+#> [115] 135902679 -2136373204 735594301 1594631386 -546138989 1423929528
+#> [121] -1067541671 1962863430 -1923418865 -984154108 1907308341 642901618
+#> [127] -1095019701 -1836613104 -1171392815 1663582814 -1258689721 -2007301412
+#> [133] -756910547 -712643830 -1271482109 -801485208 51646793 -1925477258
+#> [139] -1421379457 1104736436 -1348082651 -124611934 292791739 2126591424
+#> [145] -2043491647 -709285490 -1242530633 1688217996 -538353379 -1997652678
+#> [151] -48432781 575772696 942146361 57506214 -948054033 -72610460
+#> [157] 1389939989 656100050 -25586645 -2012424848 1854773937 1391516862
+#> [163] -2100008409 -140248004 -1638135795 -2077746326 -118729245 -1417654840
+#> [169] 662270249 942125782 -1363864737 744183316 2123821573 -80802046
+#> [175] -1753997669 1277518112 1090348705 1338137582 423408535 -28214548
+#> [181] 1164536573 1524008346 673959507 853634936 -1599644903 -2135083002
+#> [187] -345756977 -1070478652 971985653 -556736718 -406174453 663083216
+#> [193] 1258368657 1306568478 1820350727 -1068259940 -402617875 1499233226
+#> [199] -1121819965 -1773142744 1796260105 1463879990 901914175 104491892
+#> [205] 1605431269 -1933329566 1688405883 -446088064 1238889089 197049934
+#> [211] -709469577 -1072333748 1691378909 -1260585478 198644531 2053568216
+#> [217] 903127801 -1970919834 -473567825 1614821412 -1905604395 1082827666
+#> [223] 1558537707 1875545136 1518383729 -1265655426 -2085242905 1791098620
+#> [229] 1447558093 -1153758166 -99557469 -92185464 -2016280343 1847562134
+#> [235] 1495701791 -221368108 409722309 -429353022 1765302363 2137311200
+#> [241] -373658015 273043630 -350916265 -935055956 43404989 52012634
+#> [247] 1867958291 1488596536 -1347953959 174081222 2002460815 1429165444
+#> [253] -205312331 1264621554 -603785525 1270017936 -1543231919 -1282028578
+#> [259] 908887751 726075484 1269456301 -1680094070 -990917501 -1377014808
+#> [265] -1279971127 1281050102 228230143 1097770548 -1438663771 1295361058
+#> [271] 829172027 988808000 1704778305 804328206 -1257113545 -516583668
+#> [277] -1624037219 1034190522 904064243 -1716316776 1108935353 904106790
+#> [283] 1222361967 1146561252 1232110741 174767186 2136668075 -1843985680
+#> [289] 713263665 1133192766 1302119847 -499465796 -425742451 2035727594
+#> [295] 1324820835 -227988664 -1598926679 227290198 601218783 1836305300
+#> [301] 1386514821 306372738 -445226469 618852000 -25741791 156697966
+#> [307] -345772265 -2126405524 1998516861 -392853734 1588822483 1965665528
+#> [313] -1658840423 -1901588090 -687876529 -15753148 -1427453323 -1799286606
+#> [319] -47880053 97437264 -319365615 688369822 -272731001 469052188
+#> [325] 27259245 1573117258 -446761405 1976539816 2093047945 424297142
+#> [331] 1217440191 506831092 -1961736347 -1834464030 1234111227 907381248
+#> [337] -247365119 118499278 -1581033993 -893361716 -2100188067 335855482
+#> [343] 83920563 -1896483752 -323673479 -498745370 2088720687 -2102342236
+#> [349] 1873412181 226202898 -1483060885 1437743536 -430562831 -190616834
+#> [355] -1639345305 281953404 857940813 -549769814 -245419229 1375189512
+#> [361] -237346711 590186774 75687071 655107668 151057733 930998594
+#> [367] -1108466725 1398789472 1995685345 1605663278 1206398167 -1945513172
+#> [373] 1992513085 1544169434 1610742675 -152048712 -657450407 1247059526
+#> [379] 1880247311 -124605692 723920437 -1548596878 1827773003 479812880
+#> [385] 228152785 49698142 922100295 -1524757028 -845069011 534031882
+#> [391] -131080189 213485928 636833865 718143350 -1134260353 -2024842316
+#> [397] -1108831451 1977333154 1053535419 1301926080 -997856831 366738574
+#> [403] -1450544201 1064694924 -1016336355 -390217670 -1024466829 686789400
+#> [409] -2056715719 745319590 -999248145 -1240647580 -1395180523 -1837290030
+#> [415] -681354453 -514051984 1438153137 2090364862 -209968857 1765574460
+#> [421] -544057587 -844603798 -1693909789 -1746073400 -1156960215 2076419542
+#> [427] -1326601633 1784103188 -683597563 -824593918 1683989915 -509903840
+#> [433] 183502241 -132206866 -295556457 190629356 -1790739971 1849133210
+#> [439] -1660799661 214755960 -1837639143 975563526 1750237647 1014527428
+#> [445] 3490293 552878642 220695563 382907344 -1381266031 1445050910
+#> [451] 1771278343 -1719553892 862869741 583941834 -1759344189 1365915688
+#> [457] -820969463 -1381598154 -19516097 662427252 -1098735899 -812655006
+#> [463] 1658982011 -1203972224 1999245697 -1592487602 -1708699273 -1038727348
+#> [469] -725486627 747602170 2037447219 -161484328 469017081 1897421158
+#> [475] 644859055 959210276 1824012245 -1573943662 -797561621 466937648
+#> [481] 6984049 1344943230 -1963692313 507873788 1336756941 -446804182
+#> [487] -978024797 50927496 -66994199 -1542552938 -1630130145 1108679636
+#> [493] 421858501 286669506 176875355 1716904672 841747809 2002101166
+#> [499] -1936594857 -503678804 643784125 -270685862 -9162989 -1518294728
+#> [505] -1177069095 450623430 -1518307441 -2055143292 1977097653 1967586034
+#> [511] 2139569611 993708688 887981393 -146153762 -1521041977 -1948249252
+#> [517] 1992764589 1735430026 469169027 -492722456 1473540041 -1902921482
+#> [523] 1705351935 1769673012 -929011035 948225826 -946720709 1824431680
+#> [529] 1626208577 -1384520178 22671159 -1788782068 -359417955 272236986
+#> [535] -230435853 1174868120 -2145910343 -855063002 1748802159 651054564
+#> [541] -619908203 89300818 345161387 -1411621392 774662449 -1541883586
+#> [547] 1651670183 581520572 -1489764723 -2028142614 -1423847325 -1844713912
+#> [553] 1954615209 -389144746 66876895 2030417556 -361973627 -151813246
+#> [559] -1573918437 944703904 610784545 1108957294 -1875417577 -1297945748
+#> [565] 1037500797 1908181530 823650515 1875585016 -22111847 1765196934
+#> [571] -849597105 1315720004 -1748059787 -915770446 634433419 -1869504176
+#> [577] -887145199 2066662302 -939545721 -822528484 -1687437203 -1367629750
+#> [583] -1603461821 522180008 1610588041 2052437430 110280895 2014120948
+#> [589] -670960027 159018978 1050415611 568272128 -1718509311 -3409202
+#> [595] 753028343 -1139331892 -123651235 -2072165766 -1222087245 648343384
+#> [601] 1100161401 486404838 261566511 1504901284 -476745899 1151760402
+#> [607] -445050773 -130902864 -423755535 1831075326 934693479 690474876
+#> [613] -907644339 -744197974 1158732323 62223624 -1538777239 1455586326
+#> [619] -702514273 -1712778924 651699269 959548482 -586241317 1850142816
+#> [625] -647799583 2099891502
+print(runif(5))
+#> [1] 0.2875775 0.7883051 0.4089769 0.8830174 0.9404673
Thank you for taking the time to contribute to Individual. We’re +grateful that you would take some time to make Individual-Based Modeling +easier in R. Pull requests are very welcome.
+For major changes, please open an issue first to give other +developers a heads up on what you would like to add or change. That way +we can reduce any redundant efforts and give any resources if +needed.
+For bug reports please include:
+We use Git on this project. Which means we use master
,
+dev
, feat/*
, bug/*
,
+hotfix/*
and release/*
branches. Please refer
+to this
+post for more information of each type of branch. NOTE:
+bug/*
branches are feat/*
branches which fix a
+bug.
Practically speaking, all new code contributions should be
+feature branches. You should branch off of the dev
branch
+into one called feat/[your feature name here]
. When we
+consider pull requests from forked repositories to the
+mrc-ide/individual repository, we will expect this convention.
We periodically merge dev
into master
for
+small release updates. These releases will appear on the GitHub releases
+page. Please use Conventional
+Commits as it helps us version Individual properly.
Large releases will have a release/*
branch to manage
+the transition. No new features will be merged into
+release/*
branches. Only documentation and bug fixes will
+be considered.
R/simulation.R - Contains the main entry point and +configuration for models
+R/variables.R - Defines classes for the available +variables
+R/events.R - Defines classes for the available events
+src/ - The C++ side of the R interface and tests
+inst/include/Variable.h - The implementations of +Variables
+inst/include/Event.h - The implementations of Events
+tests/ - are divided into unit, integration and performance +tests. Integration tests are strongly recommended for large functions +and unit tests for everything else
+Here’s a checklist for a successful PR:
+These are the things we check for:
+Our review process is based off of RESIDE’s PR +review process
+We use google +benchmark for our microbenchmarks. You can compile and run the +benchmarks like this:
+cd tests/performance
+g++ *_benchmark.cpp -std=c++14 -lbenchmark -lpthread -o benchmark.out
+./benchmark.out
+Individual is designed for running big individual-based models. But +if you find your model taking too long or consuming all of your memory, +here are some things you can try to speed up your simulation.
+The bitset data
+structure is used to record the presence or absence of an element in a
+finite set. individual::Bitset
implements this data
+structure and is able to preform set operations extremely quickly using
+bitwise
+operations. Taking advantage of these operations can lead to very
+fast processes, when you need to find some particular subset of
+individuals.
Let’s take a look at the recovery process in
+vignette("Tutorial")
. A crucial operation here is to get
+all infectious individuals who are not already scheduled for recovery.
+The object I
is a bitset containing all those individuals
+currently infectious, and already_scheduled
is another
+bitset containing those individuals scheduled for a recovery event.
+Using already_scheduled$not()
returns a new bitset of those
+individuals not in the set of already scheduled persons. This
+is passed to the I$and()
, which modifies I
+in-place so that the result is the intersection of currently infectious
+persons and persons who have not yet been scheduled for a recovery
+event, which is precisely the set of people we want.
+recovery_process <- function(t){
+ I <- health$get_index_of("I")
+ already_scheduled <- recovery_event$get_scheduled()
+ I$and(already_scheduled$not())
+ rec_times <- rgeom(n = I$size(),prob = pexp(q = gamma * dt)) + 1
+ recovery_event$schedule(target = I,delay = rec_times)
+}
Bitsets can also be efficiently sampled using
+Bitset$sample()
. This is used in the infection process of
+vignette("Tutorial")
. Once the per-capita force of
+infection (probability of moving from S to I during this time step) is
+calculated, the bitset S
is sampled with that probability
+which modifies it in-place. The number of elements remaining after being
+sampled is binomially distributed. The argument rate
can
+also be specified as a vector of probabilities, one for each element in
+the bitset.
+infection_process <- function(t){
+ I <- health$get_size_of("I")
+ foi <- beta * I/N
+ S <- health$get_index_of("S")
+ S$sample(rate = pexp(q = foi * dt))
+ health$queue_update(value = "I",index = S)
+}
When creating a new Bitset, a user must specify the maximum size of
+the bitset. This is the maximum number of positive integers which the
+bitset can store. For example, if calling
+Bitset$new(size = 100)
, the resulting object is able to
+store the presence or absence of integers between 1 and 100 (inclusive).
+Attempting to insert or remove elements outside of this range will
+result in an error.
Bitsets offer methods to preform unions (Bitset$or()
),
+intersections (Bitset$and()
), symmetric set difference
+(also known as exclusive or, Bitset$xor()
), and set
+difference (Bitset$set_difference()
) with other bitsets.
+These methods modify the bitset in-place. The method
+Bitset$not()
gives the complement of a bitset, and returns
+a new individual::Bitset
object, leaving the original
+bitset intact. Because these set operations use bitwise operations
+directly rather than more expensive relational operators, computations
+with bitsets are extremely fast. Taking advantage of bitset operations
+can help make processes in “individual” much faster.
This can be seen when implementing a common pattern in
+epidemiological models: sampling success or failure for a bitset of
+individuals, and then generating two bitsets to hold individuals sampled
+one way or the other. A first method might use
+individual::filter_bitset
.
+n <- 1e4
+bset <- Bitset$new(n)$insert(1:n)
+probs <- runif(n)
+
+keep <- probs >= 0.5
+
+stay <- filter_bitset(bitset = bset,other = which(keep))
+leave <- filter_bitset(bitset = bset,other = which(!keep))
This pattern is almost always slower than using the sample method +with a set difference:
+
+stay <- bset$copy()
+stay$sample(rate = probs)
+
+leave <- bset$copy()$set_difference(stay)
In both instances the original bitset object bset
is not
+modified. The latter pattern can be made even faster if the original may
+be modified by directly taking the set difference with it. For models
+with large population sizes, the speed differences can be
+substantial.
Because a bitset stores integers in some finite set, it can be
+returned as an integer vector by using Bitset$to_vector()
.
+However, this is a slow and expensive operation, as data must be copied
+into a new vector which is returned to R. If your model’s dynamics
+require the frequent returning of integer vectors, an
+individual::IntegerVariable
object will be more
+appropriate. However, for most discrete variables, and especially those
+which mirror compartments in mathematical models, bitset operations and
+individual::CategoricalVariable
(which uses bitsets
+internally) should be preferred.
Every time your processes ask for a variable, there is an overhead +associated with moving simulation data into R, potentially incurring +expensive copying of data.
+Because many epidemiological models have similar state transitions,
+we’ve included several “prefab” processes and event listeners
+implemented in C++ which provide significant speed improvements and can
+be used out of the box in models. The functions return pointers which
+can be passed to the process list of
+individual::simulate_loop
or event listeners just like
+closures in R. The processes available are:
individual::bernoulli_process
: moves individuals from
+one categorical variable state to another at a constant probabilityindividual::multi_probability_bernoulli_process
: moves
+individuals from one categorical variable state to another at a
+individual level probability specified by a
+individual::DoubleVariable
objectindividual::fixed_probability_multinomial_process
:
+moves individuals from one categorical variable state to a set of
+possible destination values with constant probability to leave and
+multinomially distributed choice of destination state.individual::multi_probability_multinomial_process
:
+moves individuals from one categorical variable state to a set of
+possible destination values with individual level probability to leave
+specified by a individual::DoubleVariable
object and
+multinomially distributed choice of destination state.individual::infection_age_process
: Simulates infection
+for age-structured models, where individuals come into contact at a rate
+given by a mixing (contact) matrix.Prefabs for event listeners and renderers:
+individual::update_category_listener
: event listener
+for individual::TargetedEvent
objects which updates the
+categorical variable state when it fires.individual::reschedule_listener
: event listener for
+individual::TargetedEvent
objects which schedules some new
+followup event when it fires.individual::categorical_count_renderer_process
: used
+for individual::Render
objects that counts the size of each
+state in a categorical variable.Unfortunately, we don’t have a prefab for every situation. Please +feel free to write one of your own!
+These are the basic steps to add C++ processes to your R package:
+Run usethis::use_rcpp
to set your package up for C++
+development.
Add individual
to the LinkingTo
section
+of your package DESCRIPTION.
If you package is named mypackage
, create a header
+file containing #include<individual.h>
in any of
+these locations:
/mypackage_types.h
+ src/mypackage_types.hpp
+ src/include/mypackage_types.h
+ inst/include/mypackage_types.hpp inst
Then this header file will be automatically included in
+RcppExports.cpp
. For more information, see section “2.5
+Types in Generated Code” in the Rcpp
+Attributes vignette.
Create a file src/Makecars
containing the line
+CXX_STD = CXX14
. Because individual
uses C++14
+features, when compiling your package against it you must let the
+compiler know it should use the C++14 standard, otherwise it will not be
+able to compile.
Write your process!
Processes in C++ are of type process_t
, defined in
+inst/include/common_types.h
. Types for listeners for
+individual::Event
and
+individual::TargetedEvent
are listener_t
and
+targeted_listener_t
, defined in
+inst/include/Event.h
. Below is how the C++ implementation
+of multi_probability_bernoulli_process
is coded.
Note that the return type is a Rcpp::XPtr
(see
+here) to a process_t
, which is implemented as a
+std::function
(see
+here) object, a C++ class that can hold any callable type. The
+Rcpp::XPtr
is initialized with a pointer to a
+process_t
object, which itself holds a C++ lambda
+function, basically a function closure.
The lambda function captures the input arguments by value, and takes
+a single argument when called t
, giving the current time
+step (just like process functions in R). Sampling those individuals to
+change state is implemented with the C++ API for these objects.
#include <individual.h>
+#include <Rcpp.h>
+
+// [[Rcpp::export]]
+::XPtr<process_t> multi_probability_bernoulli_process_cpp(
+ Rcpp::XPtr<CategoricalVariable> variable,
+ Rcppconst std::string from,
+ const std::string to,
+ const Rcpp::XPtr<DoubleVariable> rate_variable
+ ){
+
+// make pointer to lambda function and return XPtr to R
+ return Rcpp::XPtr<process_t>(
+ new process_t([variable,rate_variable,from,to](size_t t){
+
+// sample leavers with their unique prob
+ individual_index_t leaving_individuals(variable->get_index_of(std::vector<std::string>{from}));
+ std::vector<double> rate_vector = rate_variable->get_values(leaving_individuals);
+ (leaving_individuals, rate_vector.begin(), rate_vector.end());
+ bitset_sample_multi_internal
+->queue_update(to, leaving_individuals);
+ variable
+}),
+ true
+ );
+ };
The exported function can be used normally in R when creating the +list of processes:
+
+processes <- list(
+ multi_probability_bernoulli_process_cpp(state, "I", "R", prob),
+ other_process_1,
+ other_process_2
+)
That’s everything you need to scale your models up to millions of +individuals!
+The Susceptible-Infectious-Recovered +(SIR) model is the “hello, world!” model for infectious disease +simulations, and here we describe how to build it in “individual”. This +tutorial will illustrate the use of events, processes, and rendering +output.
+To start, we should define some constants. The epidemic will be
+simulated in a population of 1000, where 5 persons are initially
+infectious, whose indices are randomly sampled. The effective contact
+rate \(\beta\) will be a function of
+the deterministic \(R_{0}\) and
+recovery rate \(\gamma\). We also
+specify dt
, which is the size of the time step \((\Delta t)\). Because individual’s time
+steps are all of unit length, we scale transition probabilities by
+dt
to create models with different sized steps,
+interpreting the discrete time model as a discretization of a continuous
+time model. If the maximum time is tmax
then the overall
+number of time steps is tmax/dt
.
+N <- 1e3
+I0 <- 5
+S0 <- N - I0
+dt <- 0.1
+tmax <- 100
+steps <- tmax/dt
+gamma <- 1/10
+R0 <- 2.5
+beta <- R0 * gamma
+health_states <- c("S","I","R")
+health_states_t0 <- rep("S",N)
+health_states_t0[sample.int(n = N,size = I0)] <- "I"
Next, we will define the individual::CategoricalVariable
+which should store the model’s “state”.
+health <- CategoricalVariable$new(categories = health_states,initial_values = health_states_t0)
In order to model infection, we need a process. This is a function
+that takes only a single argument, t
, for the current time
+step (unused here, but can model time-dependent processes, such as
+seasonality or school holiday). Within the function, we get the current
+number of infectious individuals, then calculate the per-capita
+force of infection on each susceptible person, \(\lambda = \beta I/N\). Next we get a
+individual::Bitset
containing those susceptible individuals
+and use the sample
method to randomly select those who will
+be infected on this time step. The probability is given by \(1 - e^{-\lambda\Delta t}\). This is the
+same as the CDF of an exponential random variate so we use
+stats::pexp
to compute that quantity. Finally, we queue a
+state update for those individuals who were sampled.
+infection_process <- function(t){
+ I <- health$get_size_of("I")
+ foi <- beta * I/N
+ S <- health$get_index_of("S")
+ S$sample(rate = pexp(q = foi * dt))
+ health$queue_update(value = "I",index = S)
+}
Now we need to model recovery. For geometrically distributed
+infectious periods, we could use another process that randomly samples
+some individuals each time step to recover, but we’ll use a
+individual::TargetedEvent
to illustrate their use. The
+recovery event is quite simple, and the “listener” which is added is a
+function that is called when the event triggers, taking
+target
, a individual::Bitset
of scheduled
+individuals, as its second argument. Those individuals are scheduled for
+a state update within the listener function body.
+recovery_event <- TargetedEvent$new(population_size = N)
+recovery_event$add_listener(function(t, target) {
+ health$queue_update("R", target)
+})
Finally, we need to define a recovery process that queues future
+recovery events. We first get individual::Bitset
objects of
+those currently infectious individuals and those who have already been
+scheduled for a recovery. Then, using bitwise operations, we get the
+intersection of already infectious persons with persons who have not
+been scheduled, precisely those who need to have recovery times sampled
+and recoveries scheduled. We sample those times from
+stats::rgeom
, where the probability for recovery is \(1-e^{-\gamma \Delta t}\). Note that we add
+one to the resulting vector, because by default R uses a “number of
+failures” parameterization rather than “number of trials”, meaning it
+would be possible for an individual to be infectious for 0 time steps
+without the correction. Finally we schedule the recovery using the
+recovery event object.
We note at this point would be possible to queue the recovery event +at the same time the infection state update was made, but we separate +event and process for illustration of how the package works.
+ +The last thing to do before simulating the model is rendering output
+to plot. We use a individual::Render
object which stores
+output from the model, which is tracked each time step. To do so
+requires another process, for which we use the “prefab”
+individual::categorical_count_renderer_process
.
+health_render <- Render$new(timesteps = steps)
+health_render_process <- categorical_count_renderer_process(
+ renderer = health_render,
+ variable = health,
+ categories = health_states
+)
Finally, the simulation can be run by passing objects to the
+individual::simulation_loop
function:
+simulation_loop(
+ variables = list(health),
+ events = list(recovery_event),
+ processes = list(infection_process,recovery_process,health_render_process),
+ timesteps = steps
+)
We can easily plot the results by accessing the renderer.
+
+states <- health_render$to_dataframe()
+health_cols <- c("royalblue3","firebrick3","darkorchid3")
+matplot(
+ x = states[[1]]*dt, y = states[-1],
+ type="l",lwd=2,lty = 1,col = adjustcolor(col = health_cols, alpha.f = 0.85),
+ xlab = "Time",ylab = "Count"
+)
+legend(
+ x = "topright",pch = rep(16,3),
+ col = health_cols,bg = "transparent",
+ legend = health_states, cex = 1.5
+)
An R package for specifying and simulating individual-based models.
+This package is designed to:
+The package can be installed from github using the “remotes” library
+
+library('remotes')
+install_github('mrc-ide/individual')
Alternatively you can install individual directly from CRAN, but be aware that the CRAN version may not be the most recent version of the package:
+
+install.packages("individual")
For development it is most convenient to run the code from source. You can install the dependencies in RStudio by opening the project and selecting “Build” > “Install and Restart”
+Command line users can execute:
+
+library('remotes')
+install_deps('.', dependencies = TRUE)
Docker users can build a minimal image with
+docker build . -f docker/Dockerfile -t [your image name]
Or if you would like devtools and documentation tools you can run
+docker build . -f docker/Dockerfile.dev -t [your image name]
We recommend first reading vignette("Tutorial")
which describes how to simulate a simple SIR model in “individual”, and later vignette("API")
which describes in detail how to use the data structures in “individual” to build more complicated models. If you are running into performance issues, learn more about how to speed up your model in vignette("Performance")
.
Individual-based models are important tools for infectious disease epidemiology, but practical use requires an implementation that is both comprehensible so that code may be maintained and adapted, and fast. “individual” is an R package which provides users a set of primitive classes using the R6 class system that define elements common to many tasks in infectious disease modeling. Using R6 classes helps ensure that methods invoked on objects are appropriate for that object type, aiding in testing and maintenance of models programmed using “individual”. Computation is carried out in C++ using Rcpp to link to R, helping achieve good performance for even complex models.
+“individual” provides a unique method to specify individual-based models compared to other agent/individual-based modeling libraries, where users specify a type for agents, which are subsequently stored in an array or other data structure. In “individual”, users instead instantiate a object for each variable which describes some aspect of state, using the appropriate R6 class. Finding subsets of individuals with particular combinations of state variables for further computation can be efficiently accomplished with set operations, using a custom bitset class implemented in C++. Additionally, the software makes no assumptions on the types of models that may be simulated (e.g. mass action, network), and updates are performed on a discrete time step.
+We hope our software is useful to infectious disease modellers, ecologists, and others who are interested in individual-based modeling in R.
+Thank you! Please refer to the vignette on vignette("Contributing")
for info on how to contribute :)
NEWS.md
+ tests/performance
PR here
+IterableBitset
object can be run without giving LTO check errors (see src/test-bitset.cpp
)Bitset$clear
to zero out all set bits in bitsets PR here
+DoubleVariable
and IntegerVariable
updates could change size of the variable object PR here
+NEWS.md
file to track changes to the package.Event.h
now defines class methods outside of the class definition for easier readability, and add documentation.TargetedEvent$schedule
now dispatches to different C++ functions in event.cpp
and Event.h
depending on if input index is a bitset or vector (previous behavior used bitset’s tovector method in R to pass a vector).test-event.R
now only contains tests for Event
class, new test file test-targetedevent.R
contains a much updated suite of tests for the TargetedEvent
class.CategoricalVariable
could be queued updates for indices in a vector that were outside the range of the population.Bitset$not
to operate in place. inplace = FALSE will be deprecated in 0.2.0Bitset
for argument index
, queue_update
methods for IntegerVariable
and DoubleVariable
pass the bitset directly to the C++ functions integer_variable_queue_update_bitset
and double_variable_queue_update_bitset
rather than converting to vector and using vector methods.CategoricalVariable.h
, IntegerVariable.h
, and DoubleVariable.h
now define class methods outside of the class definition for easier readability, and add documentation.CategoricalVariable
, IntegerVariable
, and DoubleVariable
classes define a virtual destructor with default implementation.get_index_of_set
and get_size_of_set_vector
methods for IntegerVariable
now pass arguments by reference.get_values
method for IntegerVariable
and DoubleVariable
corrected to return value rather than reference.get_values
for IntegerVariable
and DoubleVariable
to accept std::vector<size_t>
as argument rather than converting to bitset.bitset_to_vector_internal
to IterableBitset.h
.testthat/test/test-variables.R
into testthat/test/test-categoricalvariable.R
, testthat/test/test-integervariable.R
, and testthat/test/test-doublevariable.R
+#include
statements from header files.size_t
types.This is a data structure that compactly stores the presence of
+integers in some finite set (max_size
), and can
+efficiently perform set operations (union, intersection, complement, symmetric
+difference, set difference).
+WARNING: All operations are in-place so please use $copy
+if you would like to perform an operation without destroying your current bitset.
.bitset
a pointer to the underlying IterableBitset.
max_size
the maximum size of the bitset.
remove()
remove from the bitset.
+ + + +xor()
to "bitwise xor" or get the symmetric difference of two bitset +(keep elements in either bitset but not in their intersection).
+ + + +set_difference()
Take the set difference of this bitset with another
+(keep elements of this bitset which are not in other
).
sample()
sample a bitset.
+ + + +choose()
choose k random items in the bitset
+ + + +Represents a categorical variable for an individual.
+This class should be used for discrete variables taking values in
+a finite set, such as infection, health, or behavioral state. It should
+be used in preference to IntegerVariable
+if possible because certain operations will be faster.
new()
Create a new CategoricalVariable
CategoricalVariable$new(categories, initial_values)
get_index_of()
return a Bitset
for individuals with the given values
get_categories()
return a character vector of possible values. +Note that the order of the returned vector may not be the same order +that was given when the variable was intitialized, due to the underlying +unordered storage type.
+ + +queue_update()
queue an update for this variable
+ +value
the new value
index
the indices of individuals whose value will be updated
+to the one specified in value
. This may be either a vector of integers or
+a Bitset
.
Represents a continuous variable for an individual.
+get_values()
get the variable values.
+ +index
optionally return a subset of the variable vector. If
+NULL
, return all values; if passed a Bitset
+or integer vector, return values of those individuals.
get_index_of()
return a Bitset
giving individuals
+whose value lies in an interval \([a,b]\).
get_size_of()
return the number of individuals whose value lies in an interval +Count individuals whose value lies in an interval \([a,b]\).
+ + + +queue_update()
Queue an update for a variable. There are 4 types of variable update:
Subset update: The argument index
represents a subset of the variable to
+update. The argument values
should be a vector whose length matches the size of index
,
+which represents the new values for that subset.
Subset fill: The argument index
represents a subset of the variable to
+update. The argument values
should be a single number, which fills the specified subset.
Variable reset: The index vector is set to NULL
and the argument values
+replaces all of the current values in the simulation. values
should be a vector
+whose length should match the size of the population, which fills all the variable values in
+the population
Variable fill: The index vector is set to NULL
and the argument values
+should be a single number, which fills all of the variable values in
+the population.
values
a vector or scalar of values to assign at the index.
index
is the index at which to apply the change, use NULL
for the
+fill options. If using indices, this may be either a vector of integers or
+a Bitset
.
Describes a general event in the simulation.
+individual::EventBase
-> Event
Represents a integer valued variable for an individual.
+This class is similar to CategoricalVariable
,
+but can be used for variables with unbounded ranges, or other situations where part
+of an individual's state is better represented by an integer, such as
+household or age bin.
get_values()
Get the variable values.
+ +index
optionally return a subset of the variable vector. If
+NULL
, return all values; if passed a Bitset
+or integer vector, return values of those individuals.
get_index_of()
Return a Bitset
for individuals with some subset of values.
+Either search for indices corresponding to values in set
, or
+for indices corresponding to values in range \([a,b]\). Either set
+or a
and b
must be provided as arguments.
get_size_of()
Return the number of individuals with some subset of values.
+Either search for indices corresponding to values in set
, or
+for indices corresponding to values in range \([a,b]\). Either set
+or a
and b
must be provided as arguments.
queue_update()
Queue an update for a variable. There are 4 types of variable update:
+Subset update: The argument index
represents a subset of the variable to
+update. The argument values
should be a vector whose length matches the size of index
,
+which represents the new values for that subset.
Subset fill: The argument index
represents a subset of the variable to
+update. The argument values
should be a single number, which fills the specified subset.
Variable reset: The index vector is set to NULL
and the argument values
+replaces all of the current values in the simulation. values
should be a vector
+whose length should match the size of the population, which fills all the variable values in
+the population
Variable fill: The index vector is set to NULL
and the argument values
+should be a single number, which fills all of the variable values in
+the population.
values
a vector or scalar of values to assign at the index
index
is the index at which to apply the change, use NULL
for the
+fill options. If using indices, this may be either a vector of integers or
+a Bitset
.
This is a ragged array which stores doubles (numeric values).
+get_values()
Get the variable values.
+ +index
optionally return a subset of the variable vector. If
+NULL
, return all values; if passed an Bitset
+or integer vector, return values of those individuals.
get_length()
Get the lengths of the indiviudal elements in the ragged array
+ +index
optionally only get lengths for a subset of persons. If
+NULL
, return all lengths; if passed an Bitset
+or integer vector, return lengths of arrays for those individuals.
queue_update()
Queue an update for a variable. There are 4 types of variable update:
+Subset update: The argument index
represents a subset of the variable to
+update. The argument values
should be a vector whose length matches the size of index
,
+which represents the new values for that subset.
Subset fill: The argument index
represents a subset of the variable to
+update. The argument values
should be a single number, which fills the specified subset.
Variable reset: The index vector is set to NULL
and the argument values
+replaces all of the current values in the simulation. values
should be a vector
+whose length should match the size of the population, which fills all the variable values in
+the population
Variable fill: The index vector is set to NULL
and the argument values
+should be a single number, which fills all of the variable values in
+the population.
values
a list of numeric vectors
index
is the index at which to apply the change, use NULL
for the
+fill options. If using indices, this may be either a vector of integers or
+an Bitset.
This is a ragged array which stores integers (numeric values).
+get_values()
Get the variable values.
+ +index
optionally return a subset of the variable vector. If
+NULL
, return all values; if passed an Bitset
+or integer vector, return values of those individuals.
get_length()
Get the lengths of the indiviudal elements in the ragged array
+ +index
optionally only get lengths for a subset of persons. If
+NULL
, return all lengths; if passed an Bitset
+or integer vector, return lengths of arrays for those individuals.
queue_update()
Queue an update for a variable. There are 4 types of variable update:
+Subset update: The argument index
represents a subset of the variable to
+update. The argument values
should be a vector whose length matches the size of index
,
+which represents the new values for that subset.
Subset fill: The argument index
represents a subset of the variable to
+update. The argument values
should be a single number, which fills the specified subset.
Variable reset: The index vector is set to NULL
and the argument values
+replaces all of the current values in the simulation. values
should be a vector
+whose length should match the size of the population, which fills all the variable values in
+the population
Variable fill: The index vector is set to NULL
and the argument values
+should be a single number, which fills all of the variable values in
+the population.
values
a list of numeric vectors
index
is the index at which to apply the change, use NULL
for the
+fill options. If using indices, this may be either a vector of integers or
+an Bitset.
Class to render output for the simulation.
+new()
Initialise a renderer for the simulation, creates the default state +renderers.
Render$new(timesteps)
Describes a targeted event in the simulation. +This is useful for events which are triggered for a sub-population.
+individual::EventBase
-> TargetedEvent
schedule()
Schedule this event to occur in the future.
+ +target
the individuals to pass to the listener, this may be +either a vector of integers or a Bitset.
delay
the number of time steps to wait before triggering the event, +can be a scalar in which case all targeted individuals are scheduled for +for the same delay or a vector of values giving the delay for that +individual.
get_scheduled()
Get the individuals who are scheduled as a Bitset.
+ + +clear_schedule()
Stop a future event from triggering for a subset of individuals.
+ +target
the individuals to clear, this may be either a vector of integers or +a Bitset.
queue_extend_with_schedule()
Extend the target size and schedule for the new population.
+ + + +Simulate a process where individuals in a given from
state
+advance to the to
state each time step with probability rate
.
bernoulli_process(variable, from, to, rate)
a categorical variable.
a string representing the source category.
a string representing the destination category.
the probability to move individuals between categories.
a function which can be passed as a process to simulation_loop
.
Renders the number of individuals in each category.
+categorical_count_renderer_process(renderer, variable, categories)
a Render
object.
a CategoricalVariable
object.
a character vector of categories to render.
a function which can be passed as a process to simulation_loop
.
Save the simulation state in an R object, allowing it to be
+resumed later using restore_state
.
checkpoint_state(timesteps, variables, events)
<- the number of time steps that have already been simulated
the list of Variables
the list of Events
This non-modifying function returns a new Bitset
+object of the same maximum size as the original but which only contains
+those values at the indices specified by the argument other
.
+Indices in other
may be specified either as a vector of integers or as
+another bitset. Please note that filtering by another bitset is not a
+"bitwise and" intersection, and will have the same behavior as providing
+an equivalent vector of integer indices.
filter_bitset(bitset, other)
Simulates a two-stage process where all individuals
+in a given source_state
sample whether to leave or not with probability
+rate
; those who leave go to one of the destination_states
with
+probabilities contained in the vector destination_probabilities
.
fixed_probability_multinomial_process(
+ variable,
+ source_state,
+ destination_states,
+ rate,
+ destination_probabilities
+)
a CategoricalVariable
object.
a string representing the source state.
a vector of strings representing the destination states.
probability of individuals in source state to leave.
probability vector of destination states.
a function which can be passed as a process to simulation_loop
.
+ Variables+Variables are how individual represents state, and operations to find subsets of individuals based on values of their variable(s) return bitsets. + |
+ |
---|---|
+ + | +CategoricalVariable Class |
+
+ + | +IntegerVariable Class |
+
+ + | +DoubleVariable Class |
+
+ + | +RaggedInteger Class |
+
+ + | +RaggedDouble Class |
+
+ + | +A Bitset Class |
+
+ + | +Filter a bitset |
+
+ Events & Rendering+Classes for events and rendering output. + |
+ |
+ + | +EventBase Class |
+
+ + | +Event Class |
+
+ + | +TargetedEvent Class |
+
+ + | +Render |
+
+ Prefabs+Optimized processes and event listeners. + |
+ |
+ + | +Bernoulli process |
+
+ + | +Render Categories |
+
+ + | +Multinomial process |
+
+ + | +Infection process for age-structured models |
+
+ + | +Overdispersed Bernoulli process |
+
+ + | +Overdispersed multinomial process |
+
+ + | +Reschedule listener |
+
+ + | +Update category listener |
+
+ Simulation+ + |
+ |
+ + | +A premade simulation loop |
+
+ + | +Save the simulation state |
+
+ + | +Restore the simulation state |
+
Simulates infection for age-structured models, where +individuals contact each other at a rate given by some mixing (contact) matrix. +The force of infection on susceptibles in a given age class is computed as: +$$\lambda_{i} = p \sum\limits_{j} C_{i,j} \left( \frac{I_{j}}{N_{j}} \right) $$ +Where \(C\) is the matrix of contact rates, \(p\) is the probability of infection +per contact. The per-capita probability of infection for susceptible individuals is then: +$$1 - e^{-\lambda_{i} \Delta t}$$.
+infection_age_process(
+ state,
+ susceptible,
+ exposed,
+ infectious,
+ age,
+ age_bins,
+ p,
+ dt,
+ mixing
+)
a CategoricalVariable
object.
a string representing the susceptible state (usually "S").
a string representing the state new infections go to (usually "E" or "I").
a string representing the infected and infectious state (usually "I").
a IntegerVariable
giving the age of each individual.
the total number of age bins (groups).
the probability of infection given a contact.
the size of the time step (in units relative to the contact rates in mixing
).
a mixing (contact) matrix between age groups.
a function which can be passed as a process to simulation_loop
.
Simulates a Bernoulli process where all individuals
+in a given source state from
sample whether or not
+to transition to destination state to
with a
+individual probability specified by the DoubleVariable
+object rate_variable
.
multi_probability_bernoulli_process(variable, from, to, rate_variable)
a CategoricalVariable
object.
a string representing the source state.
a string representing the destination state.
DoubleVariable
giving individual probability of each individual in source state to leave.
a function which can be passed as a process to simulation_loop
.
R/prefab.R
+ multi_probability_multinomial_process.Rd
Simulates a two-stage process where all individuals
+in a given source_state
sample whether to leave or not with a
+individual probability specified by the DoubleVariable
+object rate_variable
; those who leave go to one of the destination_states
with
+probabilities contained in the vector destination_probabilities
.
multi_probability_multinomial_process(
+ variable,
+ source_state,
+ destination_states,
+ rate_variable,
+ destination_probabilities
+)
a CategoricalVariable
object.
a string representing the source state.
a vector of strings representing the destination states.
DoubleVariable
giving individual probability of each individual in source state to leave
probability vector of destination states.
a function which can be passed as a process to simulation_loop
.
Schedules a follow-up event as the result of an event +firing.
+reschedule_listener(event, delay)
the delay until the follow-up event.
Restore the simulation state from a previous checkpoint. +The state of passed events and variables is overwritten to match the state they +had when the simulation was checkpointed. Returns the time step at which the +simulation should resume.
+restore_state(state, variables, events, restore_random_state)
the simulation state to restore, as returned by restore_state
.
the list of Variables
the list of Events
if TRUE, restore R's global random number generator's state from the checkpoint.
Run a simulation where event listeners take precedence +over processes for state changes.
+a list of Variables
a list of Events
a list of processes to execute on each timestep
the end timestep of the simulation. If state
is not NULL, timesteps must be greater than state$timestep
a checkpoint from which to resume the simulation
if TRUE, restore R's global random number generator's state from the checkpoint.
population <- 4
+timesteps <- 5
+state <- CategoricalVariable$new(c('S', 'I', 'R'), rep('S', population))
+renderer <- Render$new(timesteps)
+
+transition <- function(from, to, rate) {
+ return(function(t) {
+ from_state <- state$get_index_of(from)
+ state$queue_update(
+ to,
+ from_state$sample(rate)
+ )
+ })
+}
+
+processes <- list(
+ transition('S', 'I', .2),
+ transition('I', 'R', .1),
+ transition('R', 'S', .05),
+ categorical_count_renderer_process(renderer, state, c('S', 'I', 'R'))
+)
+
+simulation_loop(variables=list(state), processes=processes, timesteps=timesteps)
+renderer$to_dataframe()
+#> timestep S_count I_count R_count
+#> 1 1 4 0 0
+#> 2 2 4 0 0
+#> 3 3 3 1 0
+#> 4 4 2 2 0
+#> 5 5 2 2 0
+
Updates the category of a sub-population as the result of an
+event firing, to be used in the TargetedEvent
+class.
update_category_listener(variable, to)
a CategoricalVariable
object.
a string representing the destination category.