kinmodel is a python package for modeling chemical kinetics, and in particular for fitting kinetic models to experimental concentration vs time data. It defines the KineticModel class, which contains a system of differential equations describing a system of chemical reactions and methods for simulating the system and fitting it to experimental concentration vs time data. It is a fairly straightforward implementation of methods in Scipy.
Note that elements of the model definition are executed within the program using Python’s exec() or eval functions! This is a security risk if models are loaded from untrusted sources.
- Python 3.6
- Numpy
- Scipy 1.2.1
- Matplotlib
- joblib
- tqdm
- PyYAML
The package can be installed as usual with pip.
At this point, primary interaction with kinmodel will probably be
through the fit_kinetics.py executable. Basic usage is of the form
fit_kinetics model_name exp_data.csv
, where model_name
defines
the kinetic model to be applied and exp_data.csv
is a csv file
containing experimental concentration vs time data for relevant species.
The csv file can contain multiple experiments that will be fitted
simultaneously for all rate constants. Experiments are separated by
title lines that must contain a title in column 1 that cannot be
interpreted as a number followed by empty cells for the remaining
columns.
The program will output txt files of the fit parameters and pdf files of
plots. At present, it produces up to two subplots so different species
can be visualized separately. There are a number of command line
options. Help is available via fit_kinetics -h
.
Data can also be simulated given a set of parameters using the
model_kinetics.py executable. The format is
model_kinetics model_name simulation_time par1 par2 par3 ...
.
Default starting concentrations associated with the model can be
overridden by including them at the end of the parameters list. Options
can be specified to control the output. Ranges for the parameters can be
set along with an optional target number of simulations (-n
): in
this case, the ranges will be divided into equal segments such that the
total number of simulations is at most the target. More information is
available from model_kinetics -h
.
Some models are included as part of the default installation, but new
ones can be defined in a simple YAML format. The models should be loaded
into either the working directory, a models
subdirectory of the
working directory, or in the user’s data directory for the program, and
must have the .yaml
file extension. They should be the only files in
these locations with that extension. The user data directory for the
program can be identified by running fit_kinetics -h
(the model
directories are listed right after the descriptions of different
options). A simple first order decay is included as a default model:
name: first_ord description: | Simple first order decay S ---> P (k) Orders: k; S, P. eq_function: | def equations(concs, t, *ks): S, P = concs k, = ks return [-k*S, +k*S, ] k_var: - name: k guess: 1 conc0_var: - name: "[S]0" guess: 100 conc0_const: - name: "[P]0" value: 0 species: - name: Starting material plot: bottom sort: 0 - name: Product plot: bottom sort: 1 calcs: - desc: "Maximum S" func: "max(c[0])" - desc: "Final S" func: "c[0][-1]"
Key fields are eq_function
, which is a Python function of
concentrations (concs), time (t), and parameters (ks) that returns the
results of the kinetic equations (first derivatives of concentrations)
defining the system. The sort
field for each species
is used to
relate to the order of columns in the input.
Models can also be defined with the IndirectKineticModel class. This allows normal KineticModel mechanisms to be used in cases where the experimental observables are a function of the underlying concentrations (e.g., oligomers where the total functional group concentration is known but individual concentrations are not). These are defined as in:
name: "DA_explicit_DA3_two_int_ss_ind" type: indirect parent_model_name: "DA_explicit_DA3_two_int_ss" description: | Indirect version of the DA_explicit_DA3_two_int_ss model, using total diacid and total anhydride concentration. species: - name: "Diacid" plot: bottom sort: 2 map: c[0] + c[3] + c[4] - name: "EDC" plot: top sort: 3 map: c[1] - name: "Urea" plot: top sort: 4 map: c[2] - name: "Linear" plot: bottom sort: 0 map: c[3] + 2*c[4] - name: "Cyclic" plot: bottom sort: 1 map: c[5]
The key here is the map
field for each species, which relate them to
concentration in the underlying model.
This indirect model uses the following as its underlying mechanism:
name: DA_explicit_DA3_two_int_ss description: | Simple model for diacid assembly with explicit consideration of linear anhydride intermediates, capped at the trimer (DA3). Separate intermediates for EDC consumption and anhydride exchange are used. DA1 + E ---> I1 (k1) I1 ---> DA1 + U (kih) I1 ---> C + U (kiC) I1 + DA1 ---> DA2 + U (kiL) I1 + DA2 ---> DA3 + U (kiL) DA2 <--> DA1 + Ip1 (k2L, km2L) C <--> Ip1 (k2C, km2C) DA3 <--> DA2 + Ip1 (k2L, km2L) DA3 <--> Ip2 + DA1 (k2L, km2L) Ip1 ---> DA1 (k3) Ip2 ---> DA2 (k3) DA2 + E ---> I2 (k1) I2 ---> DA2 + U (kih) I2 + DA1 ---> DA3 + U (kiL) Steady-state approximations with K1 = kih/kiL, EM1 = kiC/kiL, K2 = k3/km2L, and EM2 = km2C/km2L. Orders: k1, K1, EM1, K2, EM2, k2C, k2L; DA1, E, U, DA2, DA3, C. eq_function: | def equations(concs, t, *ks): DA1, E, U, DA2, DA3, C = concs k1, K1, EM1, K2, EM2, k2C, k2L = ks # Return the equations for concs return [ (DA3*k2L + k2L*DA2 - k1*DA1*E - (DA1*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) + (K2*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) - (k1*DA1**2*E)/(EM1 + K1 + DA1 + DA2) - (DA3*k2L*DA1)/(K2 + DA1) + (K1*k1*DA1*E)/(EM1 + K1 + DA1 + DA2) - (k1*DA1*DA2*E)/(K1 + DA1)), - k1*DA1*E - k1*DA2*E, + k1*DA1*E + k1*DA2*E, (DA3*k2L - k2L*DA2 - k1*DA2*E + (DA1*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) - (DA2*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) + (DA3*K2*k2L)/(K2 + DA1) + (k1*DA1**2*E)/(EM1 + K1 + DA1 + DA2) + (K1*k1*DA2*E)/(K1 + DA1) - (k1*DA1*DA2*E)/(EM1 + K1 + DA1 + DA2)), ((DA2*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) - 2*DA3*k2L + (DA3*k2L*DA1)/(K2 + DA1) + (k1*DA1*DA2*E)/(EM1 + K1 + DA1 + DA2) + (k1*DA1*DA2*E)/(K1 + DA1)), ((EM2*(DA3*k2L + k2C*C + k2L*DA2))/(EM2 + K2 + DA1 + DA2) - k2C*C + (EM1*k1*DA1*E)/(EM1 + K1 + DA1 + DA2)), ] k_var: - name: "k1" guess: 1 - name: "K1" guess: 40 - name: "EM1" guess: 50 - name: "K2" guess: 45 - name: "EM2" guess: 100 - name: "k2C" guess: 1e-2 - name: "k2L" guess: 2e-2 k_const: conc0_var: - name: "[DA1]0" guess: 25 - name: "[EDC]0" guess: 50 conc0_const: - name: "[U]0" value: 0 - name: "[DA2]0" value: 0 - name: "[DA3]0" value: 0 - name: "[C]0" value: 0 species: - name: "DA1" plot: bottom - name: "EDC" plot: top - name: "Urea" plot: top - name: "DA2" plot: bottom - name: "DA3" plot: bottom - name: "Cy" plot: bottom integrals: - desc: "(k2L*EM2*DA3)/(EM2+K2+DA1+DA2)" func: "((k[6]*k[4]*c[4]) / (k[4]+k[3]+c[0]+c[3]))" - desc: "(k2C*EM2*C)/(EM2+K2+DA1+DA2)" func: "((k[5]*k[4]*c[5]) / (k[4]+k[3]+c[0]+c[3]))" - desc: "(k2L*EM2*DA2)/(EM2+K2+DA1+DA2)" func: "((k[6]*k[4]*c[3]) / (k[4]+k[3]+c[0]+c[3]))" - desc: "k2C*C" func: "k[5]*c[5]" - desc: "(EM1*k1*DA1*E)/(EM1+K1+DA1+DA2)" func: "((k[2]*k[0]*c[0]*c[1]) / (k[2]+k[1]+c[0]+c[3]))" calcs: - desc: "C produced directly from EDC ∫(EM1*k1*DA1*E)/(EM1+K1+DA1+DA2)dt" func: "i[4]" - desc: "C yield directly from EDC ∫(EM1*k1*DA1*E)/(EM1+K1+DA1+DA2)dt/E0" func: "i[4] / c[1][0]" - desc: "Total C hydrolysis ∫(k2C*C)dt" func: "i[3]" - desc: "C produced from DA2 exchange ∫(k2L*EM2*DA2)/(EM2+K2+DA1+DA2)dt" func: "i[2]" - desc: "C produced from DA3 exchange ∫(k2L*EM2*DA3)/(EM2+K2+DA1+DA2)dt" func: "i[0]" - desc: "C produced from C after decomp ∫(k2C*EM2*C)/(EM2+K2+DA1+DA2)dt" func: "i[1]" lifetime_concs: - 3 - 4 - 5 rectime_concs: - 0
A useful feature of kinmodel is that the KineticModel objects can
contain calculations that will be performed on the results of the
regression. Various quantities, like maximum concentrations, can be
calculated. There are two aspects of this. The integrals
field
defines equations that will be integrated across the concentration vs
time data; desc
is used to describe them in the output and func
is the function that will be integrated of parameters (k) and
concentrations (c), indexed in the order they are listed elsewhere
(starting from 0). calcs
are functions of concentration (c), time
(t), parameters (k), and integrals (i) that are calculated at the end
for a given run.