Augmented_CGCNN

This software package implements a data augmentation method to enable machine learning models to predict the relaxed formation energy of unrelaxed structure inputs.

The major function of the package is to augment the training and validation data used to train a machine learning model.

The following paper describes the augmentation technique: Data-Augmentation for Graph Neural Network Learning of the Relaxed Energies of Unrelaxed Structures

The software used to train the CGCNN and the CGCNN-HD is well documented in the respective repositories.

Files and directories

cgcnn/data.py and cgcnn/model.py contain dataloaders and the CGCNN, respectively.

cgcnn/model_train.py contains the class MPCrystalGraphConvNet, which places each convolutional layer on a separate GPU. This helps alleviate memory issues when a batch contains multiple structures with more than 64 atoms.

pre_trained/ contains the files needed to load all models used in the study.

raw_dft/ contains raw DFT files. POTCAR, PROCAR, OUTCAR and vasprun.xml files have been removed.

test_data/unrelaxed/ and test_data/relaxed/ contain the Test-unrelaxed and Test-relaxed datasets, respectively. In each directory, the atom_init.json file contains embedding information for the CGCNN, and id_prop.csv is the file the CGCNN needs to load the data. Its first column is the structure id and its second column is the relaxed formation energy per atom of the structure.
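
As a minimal sketch of the id_prop.csv layout described above, the following reads the two columns with Python's standard csv module. The function name read_id_prop is hypothetical and not part of this package, and the sketch assumes the file is comma-separated with no header row.

```python
import csv
from pathlib import Path


def read_id_prop(path):
    """Return a list of (structure_id, formation_energy_per_atom) tuples.

    Assumes a headerless, comma-separated file with the structure id in
    the first column and the energy in the second column.
    """
    rows = []
    with Path(path).open(newline="") as handle:
        for record in csv.reader(handle):
            if not record:  # skip blank lines
                continue
            structure_id, energy = record[0], float(record[1])
            rows.append((structure_id, energy))
    return rows
```

For example, read_id_prop("test_data/relaxed/id_prop.csv") would return one tuple per test structure, provided the file matches the assumed layout.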

augment_mp.py queries MaterialsProject and writes the training and validation datasets.

dists.pkl contains the distribution used to perturb the structures.
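
The internal layout of dists.pkl is not documented here, so the sketch below does not read it. It only illustrates the general idea of perturbing atomic positions with displacement lengths drawn from an empirical distribution. The function perturb_positions, the plain list of magnitudes, and the uniformly random direction are all assumptions made for illustration and are not the package's implementation.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def perturb_positions(positions, magnitudes, seed=None):
    """Shift each Cartesian position by a randomly oriented displacement.

    positions: list of (x, y, z) tuples in angstroms.
    magnitudes: hypothetical empirical sample of displacement lengths;
        one value is drawn at random for every atom.
    """
    rng = random.Random(seed)
    perturbed = []
    for x, y, z in positions:
        length = rng.choice(magnitudes)
        dx, dy, dz = random_unit_vector(rng)
        perturbed.append((x + length * dx, y + length * dy, z + length * dz))
    return perturbed
```

Passing a fixed seed makes the perturbation reproducible, which is convenient when regenerating a dataset.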

predict.py predicts the formation energy per atom of both Test-relaxed and Test-unrelaxed using all four models from the study.

Prerequisites

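This package requires the following, all of which are installed by the conda command below: scikit-learn, PyTorch, torchvision, and pymatgen.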

If you are new to Python, the easiest way of installing the prerequisites is via conda. After installing conda, run the following command to create a new environment named augmented_cgcnn and install all prerequisites:

conda upgrade conda
conda create -n augmented_cgcnn python=3 scikit-learn pytorch torchvision pymatgen -c pytorch -c conda-forge

This creates a conda environment for running CGCNN and perturbing structures. Activate the environment by:

conda activate augmented_cgcnn
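
One way to confirm that the activated environment contains the prerequisites is a short check like the one below. It is only a sketch: the function name missing_modules is hypothetical, and the module names are the usual import names for the packages in the conda command above (scikit-learn is imported as sklearn and pytorch as torch).

```python
import importlib.util

# Import names for the packages installed by the conda command above.
REQUIRED_MODULES = ("sklearn", "torch", "torchvision", "pymatgen")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All prerequisites were found.")
```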

How to augment data from MaterialsProject

To generate the augmented training and validation sets, add your Materials Project API key to augment_mp.py and run:

mkdir train_data
mkdir validation_data
python augment_mp.py

This will perturb every structure in the MP database and write 80% of the data (which consists of one perturbed structure for every relaxed structure) to train_data/ and 20% to validation_data/.
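
The 80/20 division described above can be sketched as follows. How augment_mp.py actually selects structures for each set is not stated here, so the shuffle, the seed, and the function name split_train_validation are assumptions for illustration only.

```python
import random


def split_train_validation(items, train_fraction=0.8, seed=0):
    """Shuffle items and split them into training and validation lists."""
    shuffled = list(items)  # copy so the caller's sequence is untouched
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]
```

With ten structure ids, split_train_validation(ids) returns eight ids for training and two for validation, and every id appears in exactly one of the two lists.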

How to make a prediction using the pretrained models

To make predictions on Test-relaxed and Test-unrelaxed, run:

python predict.py

This will print the prediction errors for the four models and write the predictions and target values to a CSV file. The first column is the DFT value and the second column is the predicted value.
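
Given the two-column layout described above, an error metric can be recomputed from the output file with a short script such as this sketch. The choice of mean absolute error, the function name mean_absolute_error_from_csv, and the absence of a header row are assumptions. The output file name is not stated here, so it is left as a parameter.

```python
import csv


def mean_absolute_error_from_csv(path):
    """Compute the MAE between column 1 (DFT) and column 2 (predicted)."""
    errors = []
    with open(path, newline="") as handle:
        for record in csv.reader(handle):
            if not record:  # skip blank lines
                continue
            dft_value, predicted = float(record[0]), float(record[1])
            errors.append(abs(predicted - dft_value))
    if not errors:
        raise ValueError("no rows found in " + str(path))
    return sum(errors) / len(errors)
```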

Paper

Our paper is available in npj Computational Materials under DOI 10.1038/s41524-022-00891-8.

Citation

If you use the code in your work, please cite:

@article{gibson_hire_hennig_2022,
  title={Data-augmentation for graph neural network learning of the relaxed energies of unrelaxed structures},
  volume={8},
  DOI={10.1038/s41524-022-00891-8},
  number={1},
  journal={npj Computational Materials},
  author={Gibson, Jason and Hire, Ajinkya and Hennig, Richard G.},
  year={2022}
}

JasonGibsonUfl/Augmented_CGCNN
