Fix doc rendering issues #74

Merged (6 commits) on Oct 2, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
 Package: monty
 Title: Monte Carlo Models
-Version: 0.2.6
+Version: 0.2.7
 Authors@R: c(person("Rich", "FitzJohn", role = c("aut", "cre"),
                     email = "[email protected]"),
              person("Wes", "Hinsley", role = "aut"),
4 changes: 4 additions & 0 deletions R/sample-manual.R
@@ -159,6 +159,8 @@ monty_sample_manual_run <- function(chain_id, path, progress = NULL) {
 ##' [monty_sample_manual_prepare] and [monty_sample_manual_run]. If
 ##' any chain has not completed, we will error.
 ##'
+##' @title Collect manually run samples
+##'
 ##' @inheritParams monty_sample_manual_run
 ##'
 ##' @param samples Samples from the parent run. You need to provide
@@ -273,6 +275,8 @@ sample_manual_info_chain <- function(complete) {
 ##' [monty_sample_manual_prepare] is to [monty_sample]. The original
 ##' set of samples do not need to have been run manually.
 ##'
+##' @title Prepare to continue sampling with manual scheduling
+##'
 ##' @param save_samples Control over saving samples into the inputs.
 ##'   The choices here are `hash` (the default) where we save a hash
 ##'   and validate that in [monty_sample_manual_collect], `value`
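For orientation, here is a hedged sketch of how the manual-scheduling functions documented in this file fit together. Only `monty_sample_manual_run(chain_id, path, progress)` is fully visible in the hunks above; the other signatures, and the `n_chains` argument, are assumptions for illustration.

```r
# Hedged sketch of the manual sampling workflow; all signatures except
# monty_sample_manual_run() are assumptions based on the docs above.
library(monty)

model <- monty_example("banana", sigma = 0.5)
sampler <- monty_sampler_random_walk(vcv = diag(2) * 0.01)

# Write everything needed to run the chains to disk:
path <- tempfile()
monty_sample_manual_prepare(model, sampler, 1000, path, n_chains = 4)

# Each chain can then run in its own process (e.g. separate cluster jobs):
for (chain_id in seq_len(4)) {
  monty_sample_manual_run(chain_id, path)
}

# Collect the results; this errors if any chain has not completed:
samples <- monty_sample_manual_collect(path)

# To continue sampling, prepare new inputs from the parent samples;
# save_samples = "hash" (the documented default) stores a hash that is
# validated at collection time:
path2 <- tempfile()
monty_sample_manual_prepare_continue(samples, 1000, path2,
                                     save_samples = "hash")
```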
12 changes: 12 additions & 0 deletions _pkgdown.yml
@@ -77,3 +77,15 @@ reference:
       - monty_rng_distributed_pointer
       - monty_rng_distributed_state
       - monty_rng_pointer
+
+articles:
+  - title: Basics
+    navbar: Basics
+    contents:
+      - dsl
+      - samples
+  - title: Details
+    navbar: Details
+    contents:
+      - dsl-errors
+      - samplers
4 changes: 1 addition & 3 deletions man/monty_sample_manual_collect.Rd

Some generated files are not rendered by default.

5 changes: 1 addition & 4 deletions man/monty_sample_manual_prepare_continue.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion vignettes/monty.Rmd
@@ -68,7 +68,7 @@ likelihood <- monty_model_function(fn, fixed = list(data = data))
 likelihood
 ```
 
-The prior we'll make much nicer to work with in the future, but here we construct the density by hand as a sum of normally distributed priors on `a` and `b`, and a weak uniform prior on `sigma`.
+We construct a prior for this model using the monty DSL (`vignette("dsl")`), using normally distributed priors on `a` and `b`, and a weak uniform prior on `sigma`.
 
 ```{r}
 prior <- monty_dsl({
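The new chunk is truncated by the diff view. As an illustration only, not the vignette's actual code, a `monty_dsl` block matching that description might look like:

```r
# Illustrative sketch, not the vignette's actual prior: normal priors
# on a and b, and a weak uniform prior on sigma.
prior <- monty_dsl({
  a ~ Normal(0, 1)
  b ~ Normal(0, 1)
  sigma ~ Uniform(0, 10)
})
```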
76 changes: 31 additions & 45 deletions vignettes/samplers.Rmd
@@ -25,53 +25,23 @@ library(monty)

 ## The bendy banana
 
-This example shows HMC outperforming a random walk on a two dimensional banana-shaped function. Our model takes two parameters `alpha` and `beta`, and is based on two successive simple draws, with the one conditional on the other:
+This example shows HMC outperforming a random walk on a two dimensional banana-shaped function. Our model takes two parameters `alpha` and `beta`, and is based on two successive simple draws, with the one conditional on the other, so $\beta \sim Normal(0, 1)$ and $\alpha \sim Normal(\beta^2, \sigma)$, with $\sigma$ the standard deviation of the conditional draw.
 
-\[
-\beta \sim Normal(1,0) \\
-\alpha \sim Normal(\beta^2, \sigma)
-\]
-
-with $\sigma$ the standard deviation of the conditional draw.
-
-We'll use R's `dnorm` for the likelihood calculations and then differentiate the density by hand (using the formula of the gaussian density) to derive the gradient by hand. We skip the details here as this will be automated in an upcoming version of the package.
+We include this example within the package; here we create a model with $\sigma = 0.5$:

 ```{r}
-banana_model <- function(sigma = 0.5) {
-  monty_model(list(
-    parameters = c("alpha", "beta"),
-    direct_sample = function(rng) {
-      beta <- rng$random_normal(1)
-      alpha <- rng$normal(1, beta^2, sigma)
-      c(alpha, beta)
-    },
-    density = function(x) {
-      alpha <- x[[1]]
-      beta <- x[[2]]
-      dnorm(beta, log = TRUE) + dnorm((alpha - beta^2) / sigma, log = TRUE)
-    },
-    gradient = function(x) {
-      alpha <- x[[1]]
-      beta <- x[[2]]
-      c((beta^2 - alpha) / sigma^2,
-        -beta + 2 * beta * (alpha - beta^2) / sigma^2)
-    },
-    domain = rbind(c(-Inf, Inf), c(-Inf, Inf))))
-}
+m <- monty_example("banana", sigma = 0.5)
+m
 ```

-Let's create a model with $\sigma = 0.5$
-
-```{r create_banana_model}
-m <- banana_model(0.5)
-```
-
-We can plot a greyscale visualisation of its density by computing the density over a grid. Normally this is not possible of course:
+We can plot a visualisation of its density by computing the density over a grid. Normally this is not possible of course:

 ```{r}
 a <- seq(-2, 6, length.out = 1000)
 b <- seq(-2, 2, length.out = 1000)
-z <- outer(a, b, function(a, b) dnorm(b) * dnorm((a - b^2) / 0.5))
+z <- outer(a, b, function(alpha, beta) {
+  exp(monty_model_density(m, rbind(alpha, beta)))
+})
 image(a, b, z, xlab = "alpha", ylab = "beta")
 ```

[Review comment, Contributor] Above - it's not strictly "greyscale" - more sort of shades of roasted banana from yellow to gold. But it probably doesn't matter...

@@ -84,7 +54,7 @@ image(a, b, z, xlab = "alpha", ylab = "beta")
 points(s[1, ], s[2, ], pch = 19, col = "#00000055")
 ```
 
-It is also possible to compute the 95% confidence interval of the distribution using the relationship between the standard bivariate normal distribution and the banana shaped distribution as defined above. We can check than roughly 10 samples (out of 200) are out of this 95% CI contour.
+It is also possible to compute the 95% confidence interval of the distribution using the relationship between the standard bivariate normal distribution and the banana shaped distribution as defined above. We can check that roughly 10 samples (out of 200) are out of this 95% CI contour.
 
 ```{r}
 theta <- seq(0, 2 * pi, length.out = 10000)
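The rest of this chunk is elided by the diff view. As a sketch only (assuming the transformation implied by the model definition above, not the vignette's elided code), the contour can be traced by mapping the 95% circle of a standard bivariate normal $(x, y)$ through $(\alpha, \beta) = (y^2 + \sigma x, y)$:

```r
# Hedged sketch, not the elided vignette code: map the 95% probability
# circle of a standard bivariate normal (x, y) through the banana
# transformation alpha = y^2 + sigma * x, beta = y.
sigma <- 0.5
r <- sqrt(qchisq(0.95, df = 2))   # radius of the 95% circle
x <- r * cos(theta)
y <- r * sin(theta)
z95 <- cbind(y^2 + sigma * x, y)  # columns: alpha, beta
```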
@@ -104,24 +74,40 @@ points(s[1, ], s[2, ], pch = 19, col = "#00000055")

 It is not generally possible to directly sample from a density (otherwise MCMC and similar methods would not exist!). In these cases we need to use a sampler based on the density and if available possibly the gradient of the density.
 
-We can start with a basic random-walk sampler
+We can start with a basic random-walk sampler:

[Review comment, Contributor] Above on line 57, a typo - We can check than -> that

 ```{r RW_sampling}
 sampler_rw <- monty_sampler_random_walk(vcv = diag(2) * 0.01)
-res_rw <- monty_sample(m, sampler_rw, 2000)
-plot(res_rw$pars, pch = 19, col = "#ff222277")
+res_rw <- monty_sample(m, sampler_rw, 1000)
+plot(t(drop(res_rw$pars)), type = "l", col = "#0000ff66",
+     xlim = range(a), ylim = range(b))
+lines(z95[, 1], z95[, 2])
 ```
 
-As we can see this is not great. We can probably improve the samples here by finding a better variance covariance matrix (VCV), but a single VCV will not hold well over the whole surface because it is not very similar to a multivariate normal (that is, the appropriate VCV will change depending on the position in parameter space)
+As we can see this is not great, exhibiting strong random walk behaviour as it slowly explores the surface (this is over 1,000 steps). Another way to view this is the parameters varying over steps:
+
+```{r}
+matplot(t(drop(res_rw$pars)), lty = 1, type = "l", col = c(2, 4),
+        xlab = "Step", ylab = "Value")
+```
+
+We can probably improve the samples here by finding a better variance covariance matrix (VCV), but a single VCV will not hold well over the whole surface because it is not very similar to a multivariate normal (that is, the appropriate VCV will change depending on the position in parameter space).

 Let's try the Hamiltonian Monte Carlo (HMC) sampler, which uses the gradient to move efficiently in parameter space:
 
 ```{r HMC_sampling}
 sampler_hmc <- monty_sampler_hmc(epsilon = 0.1, n_integration_steps = 10)
-res_hmc <- monty_sample(m, sampler_hmc, 2000)
-plot(res_hmc$pars, pch = 19, col = "#ff222277")
+res_hmc <- monty_sample(m, sampler_hmc, 1000)
+plot(t(drop(res_hmc$pars)), type = "l", col = "#0000ff33",
+     xlim = range(a), ylim = range(b))
+lines(z95[, 1], z95[, 2])
 ```
+
+or viewed over steps:
+
+```{r}
+matplot(t(drop(res_hmc$pars)), lty = 1, type = "l", col = c(2, 4),
+        xlab = "Step", ylab = "Value")
+```
 
 Clearly better!