+
+The article describes how to use the {epireview} R package with
+{epiparameter}. {epireview} provides epidemiological parameters for a
+range of pathogens, extracted from the literature by the Pathogen
+Epidemiology Review Group (PERG) in systematic reviews. The
+{epiparameter} R package can read these epidemiological parameters and
+convert them into <epidist> objects.
+
+
The interoperability of {epireview} and {epiparameter} is still under
+development. Currently, only delay distributions (termed "Human delay"
+in {epireview}) are compatible with this feature. Attempting to convert
+other epidemiological parameters from {epireview} to {epiparameter}
+<epidist> objects will likely fail with an uninformative error message.
+
+
+
Converting {epireview} entries into an <epidist> object
+
+
The {epireview} package provides epidemiological parameter data
+extracted by systematically reviewing the literature, and {epiparameter}
+provides custom data structures for working with epidemiological data in
+R. Reading in data from the {epireview} R package and converting it to
+an <epidist> object therefore combines the two, which should make the
+parameters easier to apply in outbreak analytics.
+
Here we start with a simple example of reading in the Marburg data from
+{epireview} and converting it to an <epidist> object using the
+as_epidist() function from the {epiparameter} package.
+
+library(epiparameter)
+
+marburg_data <- epireview::load_epidata("marburg")
+#> Warning in pretty_article_label(articles, mark_multiple): There are 1 articles
+#> with missing first author surname.
+#> Warning in pretty_article_label(articles, mark_multiple): There are 1 articles
+#> with missing first author surname and first author first name.
+#> Warning in pretty_article_label(articles, mark_multiple): There are 1 articles
+#> with missing year of publication.
+#> Data loaded for marburg
+
This loads a list of four tables (specifically, tibbles) that contain
+the bibliographic information ($articles), epidemiological parameters
+($params), epidemiological models ($models), and outbreak information
+($outbreaks).
+
+names(marburg_data)
+#> [1] "articles" "params" "models" "outbreaks"
+
We will start by using only the epidemiological parameter table to
+convert information into an <epidist>.
+
+marburg_params <- marburg_data$params
+
Given that currently only delay distributions are supported for the
+conversion (this feature is still under active development) we will
+filter to only include these.
+
+delay_dist_rows <- grepl(
+ pattern = "Human delay",
+ x = marburg_params$parameter_type,
+ fixed = TRUE
+)
+marburg_params <- marburg_params[delay_dist_rows, ]
+marburg_params
+#> # A tibble: 14 × 59
+#> parameter_data_id article_id parameter_type parameter_value parameter_unit
+#> <chr> <int> <chr> <dbl> <chr>
+#> 1 ace3e3fc6f40bef6da4… 6 Human delay -… 4 Days
+#> 2 c2a35e68034b7258065… 6 Human delay -… NA Days
+#> 3 cef9471d52ca35358d7… 6 Human delay -… 9 Days
+#> 4 0106582cf5ed3c52d5e… 20 Human delay -… NA Days
+#> 5 056a8d6b5f9aee3622d… 27 Human delay -… 9 Days
+#> 6 ce3976e2e15df3f6fb9… 27 Human delay -… 5.4 Days
+#> 7 3bf73665fa67a6ba7f7… 27 Human delay -… 7 Days
+#> 8 ba019f18acac9c5b0b7… 27 Human delay -… 9.3 Days
+#> 9 71798b4154011dcd008… 27 Human delay -… 9 Days
+#> 10 b91f018b5ca72c9293d… 42 Human delay -… 9 Days
+#> 11 e95844c29c00b70dee0… 42 Human delay -… 4 Days
+#> 12 415a27bceff2afce958… 42 Human delay -… 22 Days
+#> 13 7ce1ea4e35a494de06a… 42 Human delay -… 14.3 Days
+#> 14 0eb36537ee9159b12d2… 57 Human delay -… 4.5 Days
+#> # ℹ 54 more variables: parameter_lower_bound <dbl>,
+#> # parameter_upper_bound <dbl>, parameter_value_type <chr>,
+#> # parameter_uncertainty_single_value <dbl>,
+#> # parameter_uncertainty_singe_type <chr>,
+#> # parameter_uncertainty_lower_value <dbl>,
+#> # parameter_uncertainty_upper_value <dbl>, parameter_uncertainty_type <chr>,
+#> # cfr_ifr_numerator <int>, cfr_ifr_denominator <int>, …
+
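The filtering step above can be tried without downloading any data. The
sketch below uses a small, made-up data frame (`toy_params` and its
values are hypothetical, not {epireview} data) to show how `grepl()`
with `fixed = TRUE` matches the literal string "Human delay" and how the
resulting logical vector subsets the rows.

```r
# Hypothetical stand-in for the {epireview} parameter table (made-up values).
toy_params <- data.frame(
  parameter_type = c(
    "Human delay - incubation period",
    "Reproduction number",
    "Human delay - generation time",
    "Severity"
  ),
  parameter_value = c(7, 1.5, 9, 0.8),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats the pattern as a literal string, not a regular expression.
is_delay <- grepl(
  pattern = "Human delay",
  x = toy_params$parameter_type,
  fixed = TRUE
)

# Keep only the rows whose parameter type contains "Human delay".
toy_delays <- toy_params[is_delay, ]
print(toy_delays$parameter_type)
```

Only the two rows whose type contains "Human delay" are kept.
+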
We will select the second entry, which is an incubation period, to
+use as the first example:
+
+marburg_incub <- marburg_params[2, ]
+marburg_incub
+#> # A tibble: 1 × 59
+#> parameter_data_id article_id parameter_type parameter_value parameter_unit
+#> <chr> <int> <chr> <dbl> <chr>
+#> 1 c2a35e68034b72580654… 6 Human delay -… NA Days
+#> # ℹ 54 more variables: parameter_lower_bound <dbl>,
+#> # parameter_upper_bound <dbl>, parameter_value_type <chr>,
+#> # parameter_uncertainty_single_value <dbl>,
+#> # parameter_uncertainty_singe_type <chr>,
+#> # parameter_uncertainty_lower_value <dbl>,
+#> # parameter_uncertainty_upper_value <dbl>, parameter_uncertainty_type <chr>,
+#> # cfr_ifr_numerator <int>, cfr_ifr_denominator <int>, …
+
Then we can pass our epidemiological parameter set to as_epidist() to
+do the conversion.
+
+marburg_incub_epidist <- as_epidist(marburg_incub)
+#> Using Gear (1975). "<title not available>." _<journal not available>_.
+#> To retrieve the citation use the 'get_citation' function
+#> Warning: Cannot create full citation for epidemiological parameters without bibliographic information
+#> see ?epireview_to_epidist for help.
+#> No adequate summary statistics available to calculate the parameters of the NA distribution
+#> Unparameterised <epidist> object
+marburg_incub_epidist
+#> Disease: Marburg Virus Disease
+#> Pathogen: Marburg virus
+#> Epi Distribution: incubation period
+#> Study: Gear (1975). "<title not available>." _<journal not available>_.
+#> Distribution: NA
+
The resulting <epidist> does not contain a parameterised probability
+distribution. Instead, it contains a range for the incubation period (in
+$summary_stats), and the $metadata shows that this is a single case
+from Johannesburg, South Africa.
+
+marburg_incub_epidist$summary_stats
+#> $mean
+#> [1] NA
+#>
+#> $mean_ci_limits
+#> [1] NA NA
+#>
+#> $mean_ci
+#> [1] NA
+#>
+#> $sd
+#> [1] NA
+#>
+#> $sd_ci_limits
+#> [1] NA NA
+#>
+#> $sd_ci
+#> [1] NA
+#>
+#> $median
+#> [1] NA
+#>
+#> $median_ci_limits
+#> [1] NA NA
+#>
+#> $median_ci
+#> [1] NA
+#>
+#> $quantiles
+#> [1] NA
+#>
+#> $range
+#> [1] 7 8
+marburg_incub_epidist$metadata
+#> $sample_size
+#> [1] 1
+#>
+#> $region
+#> [1] "Johannesburg"
+#>
+#> $transmission_mode
+#> [1] NA
+#>
+#> $vector
+#> [1] NA
+#>
+#> $extrinsic
+#> [1] FALSE
+#>
+#> $inference_method
+#> [1] NA
+
+
+
Creating an <epidist> with full citation
+
+
The last example showed how to convert the epidemiological parameter
+information. However, you may have noticed that the citation that was
+created did not contain the information needed for a full citation.
+
+marburg_incub_epidist$citation
+#> Gear (1975). "<title not available>." _<journal not available>_.
+
Providing a complete citation to the <epidist> object is highly
+recommended, so that you know the provenance of the parameters and can
+correctly attribute the original authors. To do so, we need to provide
+the bibliographic information from {epireview} as well as the
+epidemiological parameters.
+
The article data needs to be loaded from {epireview} using
+epireview::load_epidata_raw() rather than epireview::load_epidata(),
+because load_epidata() subsets the bibliographic information to only
+the "id", "first_author_surname", "year_publication", and
+"article_label" columns.
+
+marburg_articles <- epireview::load_epidata_raw(
+ pathogen = "marburg",
+ table = "article"
+)
+marburg_articles
+#> # A tibble: 58 × 25
+#> article_id pathogen covidence_id first_author_first_n…¹ article_title doi
+#> <dbl> <chr> <int> <chr> <chr> <chr>
+#> 1 1 Marburg v… 2059 G A Haemorrhagic… NA
+#> 2 2 Marburg v… 2042 Christian Antibodies t… NA
+#> 3 3 Marburg v… 1649 Y The origin a… 10.1…
+#> 4 4 Marburg v… 1692 D.H. Marburg-Viru… NA
+#> 5 5 Marburg v… 2597 E. D. Filovirus ac… NA
+#> 6 6 Marburg v… 3795 JS Outbreak of … 10.1…
+#> 7 7 Marburg v… 2596 E.D. Haemorrhagic… NA
+#> 8 8 Marburg v… 1615 O Viral hemorr… 10.4…
+#> 9 9 Marburg v… 1693 Smiley Suspected Ex… 10.1…
+#> 10 10 Marburg v… 1692 D Marburg-viru… NA
+#> # ℹ 48 more rows
+#> # ℹ abbreviated name: ¹first_author_first_name
+#> # ℹ 19 more variables: journal <chr>, year_publication <int>, volume <int>,
+#> # issue <int>, page_first <int>, page_last <int>, paper_copy_only <lgl>,
+#> # notes <chr>, first_author_surname <chr>, double_extracted <dbl>,
+#> # qa_m1 <int>, qa_m2 <int>, qa_a3 <int>, qa_a4 <int>, qa_d5 <int>,
+#> # qa_d6 <int>, qa_d7 <int>, score <dbl>, id <chr>
+
We need to match the entry in the epidemiological parameter table
+with the citation information in the article table to ensure we are
+using the correct citation for the parameter set. Thankfully, this can
+easily be achieved as {epireview} provides unique IDs to each table to
+link entries.
+
+article_row <- match(marburg_incub$id, marburg_articles$id)
+article_row
+#> [1] 6
+
+marburg_incub_article <- marburg_articles[article_row, ]
+marburg_incub_article
+#> # A tibble: 1 × 25
+#> article_id pathogen covidence_id first_author_first_n…¹ article_title doi
+#> <dbl> <chr> <int> <chr> <chr> <chr>
+#> 1 6 Marburg vi… 3795 JS Outbreak of … 10.1…
+#> # ℹ abbreviated name: ¹first_author_first_name
+#> # ℹ 19 more variables: journal <chr>, year_publication <int>, volume <int>,
+#> # issue <int>, page_first <int>, page_last <int>, paper_copy_only <lgl>,
+#> # notes <chr>, first_author_surname <chr>, double_extracted <dbl>,
+#> # qa_m1 <int>, qa_m2 <int>, qa_a3 <int>, qa_a4 <int>, qa_d5 <int>,
+#> # qa_d6 <int>, qa_d7 <int>, score <dbl>, id <chr>
+
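The ID-based lookup above relies on `match()`, which returns the
position of the first match, or `NA` if there is none. The sketch below
uses two made-up tables (`toy_articles` and `toy_param` are
hypothetical) to show the same lookup with an explicit guard against a
missing match, since indexing a data frame with `NA` silently returns a
row of missing values.

```r
# Hypothetical article and parameter tables linked by a shared "id" column.
toy_articles <- data.frame(
  id = c("a1", "b2", "c3"),
  article_title = c("First paper", "Second paper", "Third paper"),
  stringsAsFactors = FALSE
)
toy_param <- data.frame(
  id = "b2",
  parameter_value = 7,
  stringsAsFactors = FALSE
)

# Position of the parameter's article in the article table.
toy_row <- match(toy_param$id, toy_articles$id)

# Stop early if the parameter entry has no corresponding article.
if (anyNA(toy_row)) {
  stop("No article found for this parameter entry")
}

toy_article <- toy_articles[toy_row, ]
print(toy_article$article_title)
```
+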
Now we can repeat the conversion to <epidist> shown above, but this
+time pass the bibliographic information as well as the epidemiological
+parameter information to create a full citation. The bibliographic
+information needs to be passed with the articles argument.
+
+marburg_incub_epidist <- as_epidist(
+ marburg_incub,
+ articles = marburg_incub_article
+)
+#> Using Gear (1975). "Outbreak of Marburg virus disease in Johannesburg." _The
+#> British Medical Journal_. doi:10.1136/bmj.4.5995.489
+#> <https://doi.org/10.1136/bmj.4.5995.489>.
+#> To retrieve the citation use the 'get_citation' function
+#> No adequate summary statistics available to calculate the parameters of the NA distribution
+#> Unparameterised <epidist> object
+marburg_incub_epidist
+#> Disease: Marburg Virus Disease
+#> Pathogen: Marburg virus
+#> Epi Distribution: incubation period
+#> Study: Gear (1975). "Outbreak of Marburg virus disease in Johannesburg." _The
+#> British Medical Journal_. doi:10.1136/bmj.4.5995.489
+#> <https://doi.org/10.1136/bmj.4.5995.489>.
+#> Distribution: NA
+
+marburg_incub_epidist$citation
+#> Gear (1975). "Outbreak of Marburg virus disease in Johannesburg." _The
+#> British Medical Journal_. doi:10.1136/bmj.4.5995.489
+#> <https://doi.org/10.1136/bmj.4.5995.489>.
+
+
The as_epidist() function is an S3 generic. If you are not familiar
+with S3 object-oriented programming in R, this detail is not important.
+However, it does mean that the articles argument is not explicitly in
+the function definition of as_epidist() (i.e. it will not show up on
+autocomplete when typing out the function and will not be shown if you
+read the function help page ?as_epidist). Instead, the argument is
+specified as part of the ... argument. This is because the articles
+argument is only required when converting data from {epireview} into
+<epidist>, and other data that can be converted to <epidist> objects do
+not require this argument.
+
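For readers curious about the mechanism, the toy example below shows how
an S3 generic can accept a method-specific argument through `...`. It is
a minimal sketch only: `as_toy()` and its methods are hypothetical names
invented for illustration and are not how {epiparameter} implements
as_epidist().

```r
# Toy generic: its signature only has `x` and `...`, so `articles` is not
# visible in the generic's own definition.
as_toy <- function(x, ...) {
  UseMethod("as_toy")
}

# The data frame method is the only one that understands `articles`.
as_toy.data.frame <- function(x, ..., articles = NULL) {
  citation <- if (is.null(articles)) {
    "<citation not available>"
  } else {
    articles$article_title[1]
  }
  list(value = x$parameter_value[1], citation = citation)
}

# Other methods have no use for `articles`.
as_toy.numeric <- function(x, ...) {
  list(value = x[1], citation = NA_character_)
}

toy_row <- data.frame(parameter_value = 7)
toy_article <- data.frame(article_title = "A paper", stringsAsFactors = FALSE)

without_articles <- as_toy(toy_row)
with_articles <- as_toy(toy_row, articles = toy_article)
from_number <- as_toy(7)

print(names(formals(as_toy)))
print(with_articles$citation)
```

The generic's formal arguments are only `x` and `...`, yet the data
frame method still receives `articles` when it is supplied by name.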
+
+
+
Multi-row {epireview} entries
+
+
The way the {epireview} data is stored means that some epidemiological
+parameter entries span multiple rows. This can be, for example, because
+they contain two summary statistics (e.g. mean and standard deviation)
+that are kept on separate rows. To create <epidist> objects that contain
+the full information for each entry, multiple rows of the
+epidemiological parameters table from {epireview} can be given to
+as_epidist() to create a single <epidist> object.
+
We can search the data for entries that have multiple rows by checking
+whether there are duplicated parameter types and IDs. Remember that we
+are using only the delay distributions (i.e. Human delay) from the
+epidemiological parameters, which we subset above.
+
+multi_row_entries <- duplicated(marburg_params$parameter_type) &
+ duplicated(marburg_params$id)
+multi_row_ids <- marburg_params$id[multi_row_entries]
+
+multi_row_marburg_params <-
+ marburg_params[marburg_params$id %in% multi_row_ids, ]
+
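To see why the result of this check needs a careful look, the sketch
below runs the same logic on a made-up table (`toy_params` is
hypothetical). Because the two `duplicated()` calls are evaluated
independently, a row can be flagged when its ID and its parameter type
were each seen before on different rows. Checking for duplicated
(ID, type) pairs is one possible stricter alternative; it is an
assumption of this sketch, not the approach used in the article.

```r
# Hypothetical parameter table: "x" has two rows of the same type, while "z"
# has two rows of different types.
toy_params <- data.frame(
  id = c("x", "x", "y", "z", "z"),
  parameter_type = c(
    "generation time", "generation time", "incubation period",
    "serial interval", "incubation period"
  ),
  stringsAsFactors = FALSE
)

# Same logic as above: each column is checked for duplicates separately.
loose <- duplicated(toy_params$parameter_type) & duplicated(toy_params$id)
loose_ids <- unique(toy_params$id[loose])

# Stricter alternative: a row counts only if the (id, type) pair repeats.
strict <- duplicated(toy_params[, c("id", "parameter_type")])
strict_ids <- unique(toy_params$id[strict])

print(loose_ids)
print(strict_ids)
```

Here the independent check also flags "z", whose second row repeats a
parameter type first seen under "y", whereas the pairwise check flags
only "x".
+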
This step should be verified manually to ensure that the entries that
+have been selected are indeed multiple rows for the same reported
+epidemiological parameter. We use the first two rows of this subsetted
+table, which are the mean and standard deviation for the generation time
+of Marburg virus disease.
+
+marburg_gt <- multi_row_marburg_params[1:2, ]
+
We can now convert this to an <epidist>.
+
+marburg_gt_epidist <- as_epidist(marburg_gt)
+#> Using Ajelli (2012). "<title not available>." _<journal not available>_.
+#> To retrieve the citation use the 'get_citation' function
+#> Warning: Cannot create full citation for epidemiological parameters without bibliographic information
+#> see ?epireview_to_epidist for help.
+#> No adequate summary statistics available to calculate the parameters of the NA distribution
+#> Unparameterised <epidist> object
+marburg_gt_epidist
+#> Disease: Marburg Virus Disease
+#> Pathogen: Marburg virus
+#> Epi Distribution: generation time
+#> Study: Ajelli (2012). "<title not available>." _<journal not available>_.
+#> Distribution: NA
+marburg_gt_epidist$summary_stats
+#> $mean
+#> [1] 9
+#>
+#> $mean_ci_limits
+#> [1] 8.2 10.0
+#>
+#> $mean_ci
+#> [1] 95
+#>
+#> $sd
+#> [1] 5.4
+#>
+#> $sd_ci_limits
+#> [1] 3.9 8.6
+#>
+#> $sd_ci
+#> [1] 95
+#>
+#> $median
+#> [1] NA
+#>
+#> $median_ci_limits
+#> [1] NA NA
+#>
+#> $median_ci
+#> [1] NA
+#>
+#> $quantiles
+#> [1] NA
+#>
+#> $range
+#> [1] NA NA
+
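The object above holds a mean and a standard deviation but no
distribution, so it stays unparameterised. As a purely illustrative
sketch, the base R code below shows how a mean and standard deviation
could be turned into distribution parameters by the method of moments if
one were willing to assume a gamma distribution. That assumption is not
made by the study or by {epireview}; it is introduced here only to show
the arithmetic.

```r
# Summary statistics reported in the output above.
mean_gt <- 9
sd_gt <- 5.4

# ASSUMPTION: a gamma distribution. This is not reported by the source study.
# Method of moments: mean = shape * scale, variance = shape * scale^2.
shape <- mean_gt^2 / sd_gt^2
scale <- sd_gt^2 / mean_gt

# Recover the summary statistics from the parameters as a sanity check.
implied_mean <- shape * scale
implied_sd <- sqrt(shape) * scale

print(c(shape = shape, scale = scale))
print(c(mean = implied_mean, sd = implied_sd))
```

The recovered mean and standard deviation match the inputs, which
confirms the conversion is internally consistent. It says nothing about
whether a gamma distribution is appropriate for these data.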
+
+
Entries with probability distributions
+
+
For this example we will load the Ebola epidemiological parameters
+from the {epireview} package (as there are no entries for Marburg that
+have parametric distributions).
+
+ebola_data <- epireview::load_epidata("ebola")
+#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
+#> e.g.:
+#> dat <- vroom(...)
+#> problems(dat)
+#> Warning in load_epidata_raw(pathogen, "outbreak"): No data found for ebola
+#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
+#> e.g.:
+#> dat <- vroom(...)
+#> problems(dat)
+#> Warning in epireview::load_epidata("ebola"): No outbreaks information found for
+#> ebola
+#> Data loaded for ebola
+
We will again subset the data to just use the epidemiological
+parameter table, and subset that table to just the delay
+distributions.
+
+ebola_params <- ebola_data$params
+
+delay_dist_rows <- grepl(
+ pattern = "Human delay",
+ x = ebola_params$parameter_type,
+ fixed = TRUE
+)
+ebola_params <- ebola_params[delay_dist_rows, ]
+ebola_params
+#> # A tibble: 395 × 77
+#> id parameter_data_id covidence_id pathogen parameter_type parameter_value
+#> <chr> <chr> <int> <chr> <chr> <dbl>
+#> 1 b6168… 6f5cb18602d0dfec… 30 Ebola v… Human delay -… NA
+#> 2 b6168… 55766cedfbf75a9c… 30 Ebola v… Human delay -… NA
+#> 3 0a142… 3ae1f6b55d0f1cc5… 41 Ebola v… Human delay -… 9.2
+#> 4 0a142… f0a0191af0663265… 41 Ebola v… Human delay -… 5.8
+#> 5 0a142… a83683d0b55750df… 41 Ebola v… Human delay -… 10.6
+#> 6 99d84… f4aaae2be84dee6c… 57 Ebola v… Human delay -… 8.6
+#> 7 99d84… 1ea610e2489a3257… 57 Ebola v… Human delay -… 4.4
+#> 8 99d84… 4316d44cbd32d365… 57 Ebola v… Human delay -… 14.4
+#> 9 99d84… 7dfc17eab39f9aa0… 57 Ebola v… Human delay -… 8.6
+#> 10 99d84… c9c3fc483d332934… 57 Ebola v… Human delay -… 10.6
+#> # ℹ 385 more rows
+#> # ℹ 71 more variables: exponent <dbl>, parameter_unit <chr>,
+#> # parameter_lower_bound <dbl>, parameter_upper_bound <dbl>,
+#> # parameter_value_type <chr>, parameter_uncertainty_single_value <dbl>,
+#> # parameter_uncertainty_singe_type <chr>,
+#> # parameter_uncertainty_lower_value <dbl>,
+#> # parameter_uncertainty_upper_value <dbl>, …
+
We will select the 358th entry, which is a serial interval, as the
+study behind this entry estimated and reported a Weibull distribution:
+
+ebola_si <- ebola_params[358, ]
+ebola_si
+#> # A tibble: 1 × 77
+#> id parameter_data_id covidence_id pathogen parameter_type parameter_value
+#> <chr> <chr> <int> <chr> <chr> <dbl>
+#> 1 b76dcc… 0c3e02f80addfccc… 17730 Ebola v… Human delay -… 12
+#> # ℹ 71 more variables: exponent <dbl>, parameter_unit <chr>,
+#> # parameter_lower_bound <dbl>, parameter_upper_bound <dbl>,
+#> # parameter_value_type <chr>, parameter_uncertainty_single_value <dbl>,
+#> # parameter_uncertainty_singe_type <chr>,
+#> # parameter_uncertainty_lower_value <dbl>,
+#> # parameter_uncertainty_upper_value <dbl>, parameter_uncertainty_type <chr>,
+#> # cfr_ifr_numerator <int>, cfr_ifr_denominator <int>, …
+
We can now convert this to an <epidist>
+object.
+
+ebola_si_epidist <- as_epidist(ebola_si)
+#> Using Marziano (2023). "<title not available>." _<journal not available>_.
+#> To retrieve the citation use the 'get_citation' function
+#> Warning: Cannot create full citation for epidemiological parameters without bibliographic information
+#> see ?epireview_to_epidist for help.
+ebola_si_epidist
+#> Disease: Ebola Virus Disease
+#> Pathogen: Ebola virus
+#> Epi Distribution: serial interval
+#> Study: Marziano (2023). "<title not available>." _<journal not available>_.
+#> Distribution: weibull
+#> Parameters:
+#> shape: 1.760
+#> scale: 10.140
+
With the probability distribution for the serial interval we can use
+some of the <epidist> methods. Here we illustrate this by checking that
+the <epidist> is parameterised, plotting the PDF and CDF, and generating
+10 random numbers sampled from the distribution.
+
+
+is_parameterised(ebola_si_epidist)
+
+plot(ebola_si_epidist, day_range = 0:50)
+
+
+generate(ebola_si_epidist, times = 10)
+#> [1] 17.129838 3.840887 6.913068 14.383495 25.033088 8.693642 8.263781
+#> [8] 11.451308 5.219772 4.697018
+
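To make the quantities behind these methods concrete, the sketch below
evaluates the same Weibull distribution (shape 1.76, scale 10.14, as
printed above) with base R's `dweibull()`, `pweibull()` and
`rweibull()`. It is an independent illustration and makes no claim about
how the <epidist> methods are implemented internally; the seed is set
only so the sketch is reproducible, so the draws will differ from those
shown above.

```r
# Weibull parameters reported for the Ebola serial interval above.
shape <- 1.76
scale <- 10.14
days <- 0:50

# Density (PDF) and cumulative distribution (CDF) over the same day range.
pdf_values <- dweibull(days, shape = shape, scale = scale)
cdf_values <- pweibull(days, shape = shape, scale = scale)

# Ten random serial intervals (in days).
set.seed(1)
draws <- rweibull(10, shape = shape, scale = scale)

# Analytical mean of a Weibull distribution: scale * gamma(1 + 1 / shape).
weibull_mean <- scale * gamma(1 + 1 / shape)

print(round(cdf_values[c(11, 21, 31)], 3)) # CDF at days 10, 20 and 30
print(draws)
print(weibull_mean)
```

Passing `days` and `pdf_values` (or `cdf_values`) to `plot()` gives
curves comparable to the ones drawn by the plot method.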
+
+