Adaptive Boltzmann machine learning for Potts models of biological data

Description

This is an implementation of the Boltzmann machine learning to infer several maximum-entropy statistical models of Potts or Ising variables given a set of observables. More precisely, it infers the couplings and the fields of a set of generalized Direct Coupling Analysis (DCA) models given a Multiple Sequence Alignment (MSA) of protein or RNA sequences. It is also possible to infer an Ising model from a set of spin configurations. The learning is performed via a gradient ascent of the likelihood of the data in which the model observables are computed via a Markov Chain Monte Carlo (MCMC) sampling.

First release

The first implementation has been described in adabmDCA: Adaptive Boltzmann machine learning for biological sequences. Please cite this paper if you use (even partially) this code.

adabmDCA has been used in:

Aligning biological sequences by exploiting residue conservation and coevolution - A. Muntoni, A. Pagnani, M. Weigt, F. Zamponi (on Phys. Rev. E) to learn the Potts model and pseudo Hidden Markov model (see the option -D in Advanced options/Available maximum entropy models) of the studied protein and RNA seed alignment;
Sparse generative modeling of protein-sequence families - P. Barrat-Charlaix, A. Muntoni, K. Shimagaki, M. Weigt, F. Zamponi (on Phys. Rev. E) to learn the dense model and to perform the information-based pruning of the coupling parameters (see Advanced options/Pruning/activating the coupling).

Last release

Note that the last version of adabmDCA uses a sligthly different set of input flags compared to those presented in adabmDCA. Two learning protocols have been added. It is now possible to:

Integrate experimental fitnesses in the final model (see Improving landscape inference by integrating heterogeneous data in the inverse Ising problem)
Perform an edge-activation scheme to train a sparse model. See Towards parsimonious generative modeling of RNA families

Installation

This code is written in C/C++ language. To properly install adabmDCA, run

make

on adabmDCA/src folder. It suffices a g++ compiler.

Usage

All the possible routines (model choices, decimation/activation procedure, input and output files) implemented in adabmDCA can be shown typing

./adabmDCA -h

For a detailed description see the documentation file.

Name		Name	Last commit message	Last commit date
Latest commit History 129 Commits
docs		docs
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Adaptive Boltzmann machine learning for Potts models of biological data

Description

First release

Last release

Installation

Usage

About

Releases 1

Packages

Contributors 2

Languages

License

anna-pa-m/adabmDCA

Folders and files

Latest commit

History

Repository files navigation

Adaptive Boltzmann machine learning for Potts models of biological data

Description

First release

Last release

Installation

Usage

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages