v0.3.4 Updates for glmmrBase 0.7.1
Merge branch 'main' of https://github.com/samuel-watson/glmmrOptim

# Conflicts:
#	DESCRIPTION
#	R/R6designspace.R
samuel-watson committed Mar 1, 2024
2 parents 4b70c0c + 4d9b1fe commit 6b10976
Showing 2 changed files with 32 additions and 37 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -23,7 +23,7 @@ LinkingTo:
Rcpp (>= 1.0.7),
RcppEigen,
RcppProgress,
-glmmrBase (>= 0.4.5),
+glmmrBase (>= 0.4.6),
SparseChol (>= 0.2.1),
BH,
rminqa (>= 0.2.2)
67 changes: 31 additions & 36 deletions README.md
@@ -1,9 +1,18 @@
[![cran version](http://www.r-pkg.org/badges/version/glmmrOptim)](https://cran.r-project.org/web/packages/glmmrOptim)

# glmmrOptim
(This text relates to version 0.3.3)
R package for approximate optimal experimental designs using generalised linear mixed models (GLMM) and combinatorial optimisation methods,
built on the [glmmrBase](https://github.com/samuel-watson/glmmrBase) package. A discussion of the methods in this package can be found in [Watson et al (2023)](https://journals.sagepub.com/doi/10.1177/09622802231202379) (preprint: [arXiv:2207.09183](https://arxiv.org/abs/2207.09183)).

## Installation and building
The package is available on CRAN. A pre-compiled binary is also available with each release on this page. The package requires `glmmrBase`; it is recommended to build both `glmmrBase` and this package from source with the flags below, which can dramatically improve performance.

### Building from source
It is strongly recommended to build from source with the flags `-fno-math-errno -O3 -g`, which can cut the running time of many functions by as much as 90%. One way to do this is to set CPP_FLAGS in `~/.R/Makevars`. Alternatively, download the package source `.tar.gz` file and run from the command line:
```
R CMD INSTALL --configure-args="CPPFLAGS=-fno-math-errno -O3 -g" glmmrOptim_0.3.3.tar.gz
```
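For reference, a minimal sketch of a `~/.R/Makevars` that applies the same flags to all source installs; the variable names below are the usual R Makevars ones and may need adjusting for your compiler setup:
```
# ~/.R/Makevars (illustrative; adjust for your toolchain)
CPPFLAGS = -fno-math-errno -O3 -g
CXXFLAGS = -fno-math-errno -O3 -g
```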

## Model specification
For model specification see the readme of [glmmrBase](https://github.com/samuel-watson/glmmrBase). The `glmmrOptim` package adds the `DesignSpace` class. An instance …
@@ -18,51 +27,37 @@ The algorithm searches for a c-optimal design of size m from the design space using …
The objective function is

$$
c^TM^{-1}c
$$

where $M$ is the information matrix and $c$ is a vector. Typically $c$ will be a vector of zeros with a single 1 in the position of the parameter of interest.
For example, if the columns of $X$ in the design are an intercept, the treatment indicator, and then time period indicators, the vector $c$ may be `c(0,1,0,0,...)`,
such that the objective function is the variance of that parameter. If there are multiple designs in the design space, the $c$ vectors do not have to be the same,
as the columns of $X$ in each design might differ, in which case a list of vectors can be provided.
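As a standalone illustration of the criterion in base R (the information matrix below is invented purely for the arithmetic and does not correspond to any particular design):
```
# toy information matrix M and a c vector targeting the second parameter
M <- matrix(c(2.0, 0.5, 0.5,
              0.5, 1.5, 0.2,
              0.5, 0.2, 1.0), nrow = 3, byrow = TRUE)
c_vec <- c(0, 1, 0)
# c-optimality objective: c' M^{-1} c, the (approximate) variance of the second parameter estimate
drop(t(c_vec) %*% solve(M, c_vec))
```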

If the experimental conditions are correlated with one another, then one of the combinatorial algorithms is used to find an optimal design.
If the experimental conditions are uncorrelated (but there is correlation between observations
within the same experimental condition), then a fast algorithm can optionally be used to approximate the optimal design using a second-order cone program.
The approximate algorithm will return weights for each unique experimental condition representing the
"proportion of effort" to spend on each design condition. There are different ways to translate these weights into integer values.
Use of the approximate optimal design algorithm can be disabled with the option `force_hill=TRUE`.
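As a generic illustration of translating weights into integer values, here is a largest-remainder rounding written in plain R; it does not use any package functions and is purely illustrative:
```
# round a vector of condition weights w (summing to 1) into an integer design of size m
apportion_weights <- function(w, m) {
  raw <- w * m
  n <- floor(raw)
  # hand the leftover units to the conditions with the largest fractional parts
  leftover <- m - sum(n)
  idx <- order(raw - n, decreasing = TRUE)[seq_len(leftover)]
  n[idx] <- n[idx] + 1
  n
}
apportion_weights(c(0.42, 0.33, 0.25), 10)  # returns 4 3 3
```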
There are a variety of algorithms available:
- For design spaces with correlated experimental units, one can use a combinatorial algorithm (`algo=1` local search, `algo=2` greedy search, or `algo=3`), or the optimal mixed model weights ("Girling") algorithm with `algo="girling"`; see the sketch below this list.
- For design spaces with uncorrelated experimental units, the optimal experimental unit weights are by default calculated using a second-order cone program. To use a combinatorial algorithm instead, set `use_combin=TRUE`.
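A hedged sketch of how the algorithm is selected, reusing the `ds` object and `C` vector from the full example at the end of this readme (the design size of 30 is arbitrary):
```
# combinatorial local search over the experimental conditions
opt_local <- ds$optimal(30, C = list(c(rep(0, 6), 1)), algo = 1)
# optimal experimental condition weights via the Girling algorithm
opt_girling <- ds$optimal(30, C = list(c(rep(0, 6), 1)), algo = "girling")
```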

In some cases the optimal design will not be full rank with respect to the design matrix $X$ of the design space. This will result in a non-positive definite
information matrix, and an error. The program will indicate which columns of $X$ are likely "empty" in the optimal design. The user can then optionally remove
these columns in the algorithm using the `rm_cols` argument, which will delete the specified columns and linked observations before starting the algorithm.

The algorithm will also identify robust optimal designs if there are multiple designs in the design space. A weighted average of objective functions is used,
where the weights are specified by the `weights` field in the design space with default $1/N$. The weights may represent the prior probability or plausibility of each design, for example. The objective function can be either a linear combination of variances, or a linear combination of log variances (`robust_log=TRUE`).
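A minimal sketch of a robust search, assuming the `DesignSpace` constructor accepts multiple designs and that `weights` is a settable field as described above; `des2` is a hypothetical second model (for example, the same design with different covariance parameters):
```
# design space holding two candidate models for the same experimental conditions
ds2 <- DesignSpace$new(des, des2)
ds2$weights <- c(0.7, 0.3)   # prior plausibility of each design (default is 1/N each)
# robust c-optimal design using a weighted sum of log variances
opt <- ds2$optimal(30, C = list(c(rep(0, 6), 1), c(rep(0, 6), 1)), robust_log = TRUE)
```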

An example of model specification and optimal design search is below.
```
# design region: 7 clusters ("cl") each observed over 6 time periods ("t"),
# with one observation per cluster-period
df <- nelder(formula(~ (cl(7) * t(6)) > ind(1)))
# treatment indicator: cluster cl is treated from period t = cl onwards
df$int <- 0
df[df$t >= df$cl,'int'] <- 1
# GLMM with time-period fixed effects, a treatment effect, and cluster and
# cluster-period random intercepts
des <- Model$new(formula = ~factor(t) + int - 1 + (1|gr(cl)) + (1|gr(cl,t)),
                 covariance = c(0.04,0.01),   # parameters for the two random effect terms
                 mean = rep(0,7),             # values for the 7 fixed effect parameters
                 var_par = sqrt(0.95),        # scale parameter for the gaussian family
                 data = df,
                 family = gaussian())
ds <- DesignSpace$new(des)
# c-optimal design of size 100 targeting the treatment effect (the last column of X),
# using the Girling weights algorithm
w1 <- ds$optimal(100, C = list(c(rep(0,6),1)), verbose = TRUE, algo = "girling")
```
The design space supports any model specified in the `glmmrBase` package. Where there are non-linear functions of covariates in the fixed effects, a first-order approximation is used.
