
Sum loglikelihoods before exponentiating #105

Merged 1 commit into main on Nov 10, 2023
Conversation

@Bisaloo (Member) commented Nov 8, 2023

The order of operations at the end of likelihood() has been switched: log-likelihoods are now summed before exponentiating, which also highlighted the opportunity to use a single vectorized exp() call in some cases.
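To make the switch concrete, here is a minimal standalone sketch of the idea (generic names, not the actual epichains internals): summing log-likelihoods and exponentiating once is mathematically equivalent to exponentiating each term and then multiplying, but needs only a single exp() call.

```r
# Per-chain log-likelihoods (illustrative values)
logliks <- c(-3.2, -1.7, -4.5)

# Old order: exponentiate each term, then multiply
old_way <- prod(exp(logliks))

# New order: sum on the log scale, then exponentiate once
new_way <- exp(sum(logliks))

all.equal(old_way, new_way)
#> [1] TRUE
```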

The new code should be:

Faster (when log = FALSE, individual = FALSE)

library(microbenchmark)
library(epichains)

# randomly generate 100 chains of size 1 to 10
chain_sizes <- sample(1:10, 100, replace = TRUE)

microbenchmark(
  likelihood(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, obs_prob = 0.5
  ),
  likelihood_old(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, obs_prob = 0.5
  ),
  times = 1e3
)
#> Unit: milliseconds
#>                                                                                                                                    expr
#>      likelihood(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, obs_prob = 0.5)
#>  likelihood_old(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, obs_prob = 0.5)
#>       min       lq     mean   median       uq      max neval
#>  273.8103 312.9286 341.6483 332.5475 357.6596 609.6394  1000
#>  277.9916 312.7059 340.9575 332.5310 357.6125 589.5059  1000

microbenchmark(
  likelihood(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, log = FALSE, obs_prob = 0.5
  ),
  likelihood_old(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, log = FALSE, obs_prob = 0.5
  ),
  times = 1e3
)
#> Unit: milliseconds
#>                                                                                                                                                 expr
#>      likelihood(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, log = FALSE, obs_prob = 0.5)
#>  likelihood_old(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, log = FALSE, obs_prob = 0.5)
#>       min       lq     mean   median       uq      max neval
#>  280.8126 318.0650 351.5412 339.0232 366.4869 712.5673  1000
#>  295.8321 327.4035 363.9978 348.9071 373.9444 685.0759  1000

microbenchmark(
  likelihood(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, log = FALSE,
    individual = TRUE, obs_prob = 0.5
  ),
  likelihood_old(
    chains = chain_sizes, statistic = "size",
    offspring_dist = "pois", nsim_obs = 1e4, lambda = 0.5, log = FALSE,
    individual = TRUE, obs_prob = 0.5
  ),
  times = 1e3
)
#> Unit: milliseconds
#>                                                                                                                                                                         expr
#>      likelihood(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, log = FALSE, individual = TRUE,      obs_prob = 0.5)
#>  likelihood_old(chains = chain_sizes, statistic = "size", offspring_dist = "pois",      nsim_obs = 10000, lambda = 0.5, log = FALSE, individual = TRUE,      obs_prob = 0.5)
#>       min       lq     mean   median       uq      max neval
#>  287.1338 308.0223 334.6865 335.8227 355.4666 544.8965  1000
#>  286.0165 306.2436 334.0689 333.8073 353.5573 664.2069  1000

Created on 2023-11-06 with reprex v2.0.2

More robust

Working with very small values, such as those produced by exponentiating a large negative number, is often problematic: they can underflow to zero.
This is one of the reasons log-likelihoods are used in the first place, as discussed in this SO thread.
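As an illustration (a standalone sketch, not epichains code): exponentiating large negative log-likelihoods individually underflows to zero, whereas the sum on the log scale stays exact.

```r
# Each term underflows: exp(-800) is smaller than the smallest
# representable double (~5e-324), so it evaluates to exactly 0.
logliks <- rep(-800, 3)

prod(exp(logliks))  # product of underflowed terms is 0
#> [1] 0

sum(logliks)  # summing on the log scale first keeps the exact value
#> [1] -2400
```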

Simpler

An ifelse() call has been removed.

Note

This seems like a minimal improvement, so I'm submitting it as a draft PR to get opinions. I'm fine with either outcome (closed without merging, or merged).

@Bisaloo Bisaloo marked this pull request as draft November 8, 2023 10:51
@jamesmbaazam (Member) commented:
Thanks @Bisaloo. This was raised in #65 so this PR is definitely welcome. I'll review the changes.

@jamesmbaazam jamesmbaazam marked this pull request as ready for review November 10, 2023 09:19
@codecov-commenter commented:
Codecov Report

Merging #105 (58146a4) into main (ee2b201) will decrease coverage by 0.24%.
The diff coverage is 83.33%.

❗ Current head 58146a4 differs from pull request most recent head 9a748f3. Consider uploading reports for the commit 9a748f3 to get more accurate results

@@            Coverage Diff             @@
##             main     #105      +/-   ##
==========================================
- Coverage   98.82%   98.58%   -0.24%     
==========================================
  Files           8        8              
  Lines         424      425       +1     
==========================================
  Hits          419      419              
- Misses          5        6       +1     
Files Coverage Δ
R/likelihood.R 98.33% <83.33%> (-1.67%) ⬇️


@jamesmbaazam (Member) commented Nov 10, 2023

Thanks for the PR @Bisaloo. LGTM. I'm going to rebase and merge now.

@jamesmbaazam jamesmbaazam merged commit fd81398 into main Nov 10, 2023
7 checks passed
@jamesmbaazam jamesmbaazam deleted the sum-before-exp branch November 10, 2023 15:46
3 participants