Adapted from Navarro-Brul et al., React. Chem. Eng., 2022
Design of Experiments (DOE) is the theory of choosing the optimal set of trials for model-testing experimentation. DOE packages may offer four different capabilities:
- Generation of the design, e.g., generating factorial designs, Latin hypercubes, etc., upon user request
- Analysis of the design, e.g., the ability to compare the sampling optimality of different designs for the model hypothesis, to evaluate the aliasing of factors, etc.
- Analysis of the response, e.g., the ability to test the model and fit the coefficients. Most open-source DOE packages lack this ability and rely instead on well-established statistics packages such as statsmodels and scikit-learn.
- Design augmentation, which is the typical pipeline of Active Learning (or Bayesian Optimization): the response of the early trials is used to suggest a new set of trials that offers a promising compromise between exploitation and exploration toward an optimum.
In the following, we list a number of open-source packages that focus on the generation and analysis of designs.
Don't hesitate to open an issue to report any package (with a reasonable user base) that is missing from this list.
A collection of "classical" design of experiments.
We refer to a fork called pyDOE2, which just adds the GSD method (i.e., a fractional factorial with 3 or more levels) to pyDOE.
- Factorial Designs
  - General Full-Factorial (fullfact)
  - 2-level Full-Factorial (ff2n)
  - 2-level Fractional Factorial (fracfact)
  - Plackett-Burman (pbdesign)
  - Generalized Subset Designs (gsd)
- Response-Surface Designs
  - Box-Behnken (bbdesign)
  - Central-Composite (ccdesign)
- Randomized Designs
  - Latin-Hypercube (lhs)
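To make concrete what the factorial generators listed above produce, here is a minimal pure-Python sketch. It does not call pyDOE2: the helper names `full_factorial` and `two_level_full_factorial` are hypothetical, and the row ordering may differ from what the library returns.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of level indices.

    levels[i] is the number of levels of factor i
    (hypothetical helper, not the pyDOE2 API).
    """
    return [list(run) for run in product(*(range(n) for n in levels))]


def two_level_full_factorial(n_factors):
    """Return all 2**n_factors runs, coded as -1 / +1."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]


design = full_factorial([2, 3])       # 2 x 3 = 6 runs
coded = two_level_full_factorial(3)   # 2**3 = 8 runs
```

The number of runs grows as the product of the level counts, which is why fractional and screening designs become attractive as soon as more than a handful of factors are involved.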
Another collection of "classical" design of experiments.
- Full factorial: build.full_fact()
- 2-level fractional factorial: build.frac_fact_res()
- Plackett-Burman: build.plackett_burman()
- Sukharev grid: build.sukharev()
- Box-Behnken: build.box_behnken()
- Box-Wilson (Central-composite)
  - with center-faced option: build.central_composite() with face='ccf' option
  - with center-inscribed option: build.central_composite() with face='cci' option
  - with center-circumscribed option: build.central_composite() with face='ccc' option
- Latin hypercube (simple): build.lhs()
- Latin hypercube (space-filling): build.space_filling_lhs()
- Random k-means cluster: build.random_k_means()
- Maximin reconstruction: build.maximin()
- Halton sequence based: build.halton()
- Uniform random matrix: build.uniform_random()
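As an illustration of the simple Latin hypercube design in the list above, the following sketch samples the unit hypercube with the standard library only. The function `latin_hypercube` is a hypothetical helper written for this example and is not the API of any package listed here.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Simple Latin hypercube sample in the unit hypercube [0, 1)^n_dims.

    Each dimension is cut into n_samples equal strata and every stratum
    receives exactly one point (hypothetical helper, not a library API).
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose the per-dimension columns into one row per sample.
    return [list(point) for point in zip(*columns)]


points = latin_hypercube(5, 2, seed=0)
```

Space-filling variants typically start from a sample like this one and then optimize a criterion such as the minimal pairwise distance.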
Yet another collection of "classical" design of experiments.
- Fractional Factorial: build_factorial(factor_count, run_count)
- Full Factorial: build_full_factorial(factor_count)
- Central Composite: build_ccd(factor_count, alpha='rotatable', center_points=1)
- Mixture Simplex Lattice: build_simplex_lattice(factor_count, model_order=ModelOrder.quadratic)
- Mixture Simplex Centroid: build_simplex_centroid(factor_count)
- Optimal Designs: build_optimal(factor_count, **kwargs)
Analysis of the design:
- Statistical Power: f_power(model, design, effect_size, alpha)
- Alias list: alias_list(model, design)
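To show what an alias list reports, the sketch below builds a two-level half-fraction by hand and finds the model terms whose columns coincide. It is a pure-Python illustration under the assumption of a model with main effects and two-factor interactions; `aliased_pairs` is a hypothetical helper, not the function listed above.

```python
from itertools import combinations, product

# Half-fraction 2^(3-1): A and B form a full factorial, C is generated as A*B.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

# Model columns: main effects plus all two-factor interactions.
names = ["A", "B", "C"]
columns = {name: [run[i] for run in runs] for i, name in enumerate(names)}
for (i, p), (j, q) in combinations(list(enumerate(names)), 2):
    columns[p + q] = [run[i] * run[j] for run in runs]


def aliased_pairs(cols):
    """Return pairs of terms whose columns are identical up to sign."""
    pairs = []
    for (n1, c1), (n2, c2) in combinations(cols.items(), 2):
        if c1 == c2 or c1 == [-x for x in c2]:
            pairs.append((n1, n2))
    return pairs


aliases = aliased_pairs(columns)
```

In this design every main effect is confounded with one two-factor interaction, so the two cannot be estimated separately from the four runs.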
Collection of algorithms for uniform sampling, and related topics.
- cube - Uniform sampling from the unit hypercube
  - cube.stratify_conventional: stratification of the unit hypercube
  - stratify_generalized: generalized stratification of the unit hypercube
  - cube.latin_design: generate a random latin hypercube design matrix
  - cube.improved_latin_design: generate an ‘improved’ latin hypercube design matrix
  - cube.rank1_design: design matrix for a rank-1 lattice
  - cube.sample_halton: generate a Halton point set
  - cube.sample_maximin: maximize the minimal distance in the unit hypercube with extensions
  - cube.sample_k_means: in its default setup, this algorithm converges to a centroidal Voronoi tessellation of the unit hypercube
  - cube.grid: create conventional grid in the unit hypercube
- simplex - Uniform sampling on the unit simplex
- polytope - Uniform sampling from convex polytopes
- subset - Select diverse subsets
  - subset.psa_partition: partition the data set into the given number of clusters with the part-and-select algorithm
  - subset.psa_select: select representative points with the part-and-select algorithm
  - subset.select_greedy_maximin: greedily select a subset according to maximin criterion
  - subset.select_greedy_maxisum: greedily select a subset according to maxisum criterion
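The greedy maximin selection mentioned above can be sketched in a few lines of standard-library Python. This is an illustration of the idea only: the signature below (including the `first` argument) is an assumption made for this example and does not reproduce the library's implementation.

```python
import math


def select_greedy_maximin(points, k, first=0):
    """Greedily pick k points so that each new point maximizes its
    distance to the nearest already-selected point.

    Hypothetical helper illustrating the idea; returns point indices.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [first]
    # Distance from every point to its nearest selected point so far.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        best = max(range(len(points)), key=lambda i: nearest[i])
        selected.append(best)
        nearest = [min(d, math.dist(p, points[best]))
                   for d, p in zip(nearest, points)]
    return selected


candidates = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.0, 1.0)]
chosen = select_greedy_maximin(candidates, 3)
```

Starting from the first candidate, the procedure skips the near-duplicates and keeps three well-separated corners of the square.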
Analysis of the design:
- indicator.solow_polasky_diversity: Solow-Polasky diversity
- indicator.weitzman_diversity: Weitzman diversity
- indicator.sum_of_dists: square root of the sum of all pairwise distances
- indicator.average_inverse_dist: average inverse distance
- indicator.separation_dist: minimal pairwise distance
- indicator.wmh_index: quality index of Wahl, Mercadier, and Helbert
- indicator.sum_of_nn_dists: sum of nearest-neighbor distances
- indicator.unanchored_L2_discrepancy: unanchored L2 discrepancy
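Two of the distance-based indicators above are simple enough to sketch directly. The functions below are hypothetical stand-ins written with the standard library; the exact definitions and normalizations used by the package may differ.

```python
import math
from itertools import combinations


def min_pairwise_distance(points):
    """Minimal pairwise Euclidean distance (larger means better spread)."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def mean_inverse_distance(points):
    """Mean of 1/distance over all pairs (smaller means better spread)."""
    inverse = [1.0 / math.dist(p, q) for p, q in combinations(points, 2)]
    return sum(inverse) / len(inverse)


clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
```

Comparing the two point sets, the spread-out square scores a larger minimal distance and a smaller mean inverse distance than the clustered one, which is the direction both indicators are meant to reward.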
Definitive Screening Design - GitHub
Implementation of the DSD in Python: a small design aimed at screening all factors for second-order models.
dsd.generate(n_num, n_cat, factors_dict=None, method='dsd', min_13=True, n_fake_factors=0)
Analysis of the design:
dsd.analysis.get_map_of_correlations(X, effects)
Package focused on the Latin Hypercube Design (LHD), to generate and analyze several variants of this design.
- Classical latin hypercube: pyLHD.LatinHypercube(size, seed, scramble)
Analysis of the design:
Average Absolute Correlation, Maximum Absolute Correlation, Maximum Projection Criterion (Joseph 2015), Coverage measure, Inter-site Distance, Discrepancy, MaxiMin, Mesh Ratio, Phi_p Criterion.
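The correlation-based criteria in this list can be illustrated without any dependency. The sketch below assumes they are computed from the Pearson correlations between pairs of design columns; `column_correlations` is a hypothetical helper, and the package's own functions may be defined or normalized differently.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length columns."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def column_correlations(design):
    """Return (average, maximum) absolute correlation between columns."""
    cols = list(zip(*design))
    corrs = [abs(pearson(a, b)) for a, b in combinations(cols, 2)]
    return sum(corrs) / len(corrs), max(corrs)


# Two 4-run Latin hypercubes with levels 1..4 in each column.
diagonal = [(1, 1), (2, 2), (3, 3), (4, 4)]   # fully correlated columns
balanced = [(1, 2), (2, 4), (3, 1), (4, 3)]   # uncorrelated columns
```

Both designs are valid Latin hypercubes, yet only the second one lets the two factor effects be estimated independently, which is what these criteria are meant to detect.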
BoFire is a Bayesian Optimization Framework Intended for Real Experiments. It contains nice features to generate a DoE when starting from scratch.
- D-, A-, G-, E-, K- optimization in a constrained design space
- Space filling in a constrained design space
Analysis of the design:
bofire.utils.doe.get_confounding_matrix()
The Orthogonal Array package contains functionality to generate and analyse orthogonal arrays, optimal designs and conference designs.
- Generate (oapackage.arraydata_t()) and extend (oapackage.extend_array()) orthogonal arrays
- Conference designs (oapackage.conference_t())
- D-Efficient optimized design (oapackage.Doptimize())
Analysis of the design:
- D-, Ds-, A-, E- efficiency of the design (.Defficiency(), .DsEfficiency(), .Aefficiency(), .Eefficiency())
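For readers unfamiliar with efficiency measures, the sketch below computes a D-efficiency from scratch using the common textbook definition det(X'X)^(1/p) / N, with X the model matrix of an intercept plus main effects. This is an assumption made for illustration: the helpers are hypothetical, and the model and scaling used by the Orthogonal Array package may differ.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def d_efficiency(design):
    """D-efficiency det(X'X)^(1/p) / N for an intercept + main-effects model.

    Equals 1 for an orthogonal two-level design coded as -1 / +1.
    """
    x = [[1.0] + list(run) for run in design]
    n_runs, n_terms = len(x), len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(n_terms)]
           for i in range(n_terms)]
    return determinant(xtx) ** (1.0 / n_terms) / n_runs


orthogonal = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
unbalanced = [(-1, -1), (-1, 1), (1, 1), (1, 1)]
```

Replacing one run of the orthogonal design with a duplicate lowers the value below 1, which is how such a measure ranks competing designs with the same number of runs.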