v0.3.4 Updates for glmmrBase 0.7.1
Merge branch 'main' of https://github.com/samuel-watson/glmmrOptim

# Conflicts:
#	DESCRIPTION
#	R/R6designspace.R
samuel-watson committed Mar 1, 2024
2 parents 4b70c0c + 4d9b1fe commit 6b10976
Showing 2 changed files with 32 additions and 37 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -23,7 +23,7 @@ LinkingTo:
Rcpp (>= 1.0.7),
RcppEigen,
RcppProgress,
-glmmrBase (>= 0.4.5),
+glmmrBase (>= 0.4.6),
SparseChol (>= 0.2.1),
BH,
rminqa (>= 0.2.2)
67 changes: 31 additions & 36 deletions README.md
@@ -1,9 +1,18 @@
[![cran version](http://www.r-pkg.org/badges/version/glmmrOptim)](https://cran.r-project.org/web/packages/glmmrOptim)

# glmmrOptim
(This text relates to version 0.3.3)
R package for approximate optimal experimental designs using generalised linear mixed models (GLMM) and combinatorial optimisation methods,
built on the [glmmrBase](https://github.com/samuel-watson/glmmrBase) package. A discussion of the methods in this package can be found in [Watson et al (2023)](https://journals.sagepub.com/doi/10.1177/09622802231202379) (preprint: [arXiv:2207.09183](https://arxiv.org/abs/2207.09183)).

## Installation and building
The package is available on CRAN. A pre-compiled binary is also available with each release on this page. The package requires `glmmrBase`; it is recommended to build both `glmmrBase` and this package from source with the flags below, which can dramatically improve performance.

### Building from source
It is strongly recommended to build from source with the flags `-fno-math-errno -O3 -g`, which can cut the running time of many functions by as much as 90%. One way to do this is to set CPP_FLAGS in `~/.R/Makevars`. Alternatively, download the package source `.tar.gz` file and run from the command line:
```
R CMD INSTALL --configure-args="CPPFLAGS=-fno-math-errno -O3 -g" glmmrOptim_0.3.3.tar.gz
```
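For reference, a minimal sketch of a `~/.R/Makevars` that applies the same flags to all source installs; the variable names below are the usual R Makevars ones and may need adjusting for your compiler setup:
```
# ~/.R/Makevars (illustrative; adjust for your toolchain)
CPPFLAGS = -fno-math-errno -O3 -g
CXXFLAGS = -fno-math-errno -O3 -g
```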

## Model specification
For model specification see the readme of [glmmrBase](https://github.com/samuel-watson/glmmrBase). The `glmmrOptim` package adds the `DesignSpace` class. An instance …
@@ -18,51 +27,37 @@ The algorithm searches for a c-optimal design of size m from the design space using …
The objective function is

$$
c^TM^{-1}c
$$

where $M$ is the information matrix and $c$ is a vector. Typically $c$ will be a vector of zeros with a single 1 in the position of the parameter of interest.
For example, if the columns of $X$ in the design are an intercept, the treatment indicator, and then time period indicators, the vector $c$ may be `c(0,1,0,0,...)`,
such that the objective function is the variance of that parameter. If there are multiple designs in the design space, the $c$ vectors do not have to be the same,
as the columns of $X$ in each design might differ, in which case a list of vectors can be provided.
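As a standalone illustration of the criterion in base R (the information matrix below is invented purely for the arithmetic and does not correspond to any particular design):
```
# toy information matrix M and a c vector targeting the second parameter
M <- matrix(c(2.0, 0.5, 0.5,
              0.5, 1.5, 0.2,
              0.5, 0.2, 1.0), nrow = 3, byrow = TRUE)
c_vec <- c(0, 1, 0)
# c-optimality objective: c' M^{-1} c, the (approximate) variance of the second parameter estimate
drop(t(c_vec) %*% solve(M, c_vec))
```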

If the experimental conditions are correlated with one another, then one of the combinatorial algorithms is used to find an optimal design.
If the experimental conditions are uncorrelated (but there is correlation between observations
within the same experimental condition), then a fast algorithm can optionally be used to approximate the optimal design using a second-order cone program.
The approximate algorithm will return weights for each unique experimental condition representing the
"proportion of effort" to spend on each design condition. There are different ways to translate these weights into integer values.
Use of the approximate optimal design algorithm can be disabled with the option `force_hill=TRUE`.
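As a generic illustration of translating weights into integer values, here is a largest-remainder rounding written in plain R; it does not use any package functions and is purely illustrative:
```
# round a vector of condition weights w (summing to 1) into an integer design of size m
apportion_weights <- function(w, m) {
  raw <- w * m
  n <- floor(raw)
  # hand the leftover units to the conditions with the largest fractional parts
  leftover <- m - sum(n)
  idx <- order(raw - n, decreasing = TRUE)[seq_len(leftover)]
  n[idx] <- n[idx] + 1
  n
}
apportion_weights(c(0.42, 0.33, 0.25), 10)  # returns 4 3 3
```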
There are a variety of algorithms available:
- For design spaces with correlated experimental units, one can use a combinatorial algorithm (`algo=1` local search, `algo=2` greedy search, or `algo=3`), or the optimal mixed model weights ("Girling") algorithm with `algo="girling"`; see the sketch below this list.
- For design spaces with uncorrelated experimental units, the optimal experimental unit weights are by default calculated using a second-order cone program. To use a combinatorial algorithm instead, set `use_combin=TRUE`.
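A hedged sketch of how the algorithm is selected, reusing the `ds` object and `C` vector from the full example at the end of this readme (the design size of 30 is arbitrary):
```
# combinatorial local search over the experimental conditions
opt_local <- ds$optimal(30, C = list(c(rep(0, 6), 1)), algo = 1)
# optimal experimental condition weights via the Girling algorithm
opt_girling <- ds$optimal(30, C = list(c(rep(0, 6), 1)), algo = "girling")
```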

In some cases the optimal design will not be full rank with respect to the design matrix $X$ of the design space. This will result in a non-positive definite
information matrix, and an error. The program will indicate which columns of $X$ are likely "empty" in the optimal design. The user can then optionally remove
these columns in the algorithm using the `rm_cols` argument, which will delete the specified columns and linked observations before starting the algorithm.

The algorithm will also identify robust optimal designs if there are multiple designs in the design space. A weighted average of objective functions is used,
where the weights are specified by the `weights` field in the design space with default $1/N$. The weights may represent the prior probability or plausibility of each design, for example. The objective function can be either a linear combination of variances, or a linear combination of log variances (`robust_log=TRUE`).
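A minimal sketch of a robust search, assuming the `DesignSpace` constructor accepts multiple designs and that `weights` is a settable field as described above; `des2` is a hypothetical second model (for example, the same design with different covariance parameters):
```
# design space holding two candidate models for the same experimental conditions
ds2 <- DesignSpace$new(des, des2)
ds2$weights <- c(0.7, 0.3)   # prior plausibility of each design (default is 1/N each)
# robust c-optimal design using a weighted sum of log variances
opt <- ds2$optimal(30, C = list(c(rep(0, 6), 1), c(rep(0, 6), 1)), robust_log = TRUE)
```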

An example of model specification and optimal design search is below.
```
# design region: 7 clusters ("cl") each observed over 6 time periods ("t"),
# with one observation per cluster-period
df <- nelder(formula(~ (cl(7) * t(6)) > ind(1)))
# treatment indicator: cluster cl is treated from period t = cl onwards
df$int <- 0
df[df$t >= df$cl,'int'] <- 1
# GLMM with time-period fixed effects, a treatment effect, and cluster and
# cluster-period random intercepts
des <- Model$new(formula = ~factor(t) + int - 1 + (1|gr(cl)) + (1|gr(cl,t)),
                 covariance = c(0.04,0.01),   # parameters for the two random effect terms
                 mean = rep(0,7),             # values for the 7 fixed effect parameters
                 var_par = sqrt(0.95),        # scale parameter for the gaussian family
                 data = df,
                 family = gaussian())
ds <- DesignSpace$new(des)
# c-optimal design of size 100 targeting the treatment effect (the last column of X),
# using the Girling weights algorithm
w1 <- ds$optimal(100, C = list(c(rep(0,6),1)), verbose = TRUE, algo = "girling")
```
The design space supports any model specified in the `glmmrBase` package. Where there are non-linear functions of covariates in the fixed effects, a first-order approximation is used.
