This repository is the official implementation of DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization.
conda create -n decompdiff python=3.8
conda activate decompdiff
conda install numpy==1.22.3
conda install pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.6 -c pytorch -c nvidia
conda install pyg -c pyg
conda install rdkit openbabel tensorboard pyyaml easydict python-lmdb -c conda-forge
# For decomposition
conda install -c conda-forge mdtraj
pip install alphaspace2
# For Vina Docking
pip install meeko==0.3.0 scipy pdb2pqr vina==1.2.2
python -m pip install git+https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3
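After installation, a quick check that the packages can be found may save a failed run later. The following is a minimal sketch and not part of the repository. The import names in `REQUIRED_MODULES` are assumptions inferred from the packages installed above (for example `yaml` for pyyaml, `lmdb` for python-lmdb, `torch_geometric` for pyg, `AutoDockTools` for AutoDockTools_py3) and may need adjusting.

```python
from importlib.util import find_spec

# Import names are assumptions inferred from the install commands above;
# adjust them if a package exposes a different top-level module.
REQUIRED_MODULES = [
    "numpy", "torch", "torch_geometric", "rdkit", "openbabel", "tensorboard",
    "yaml", "easydict", "lmdb", "mdtraj", "alphaspace2", "meeko", "scipy",
    "pdb2pqr", "vina", "AutoDockTools",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

The check only locates the modules without importing them, so it does not verify versions or CUDA support.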
We decomposed the molecules in the CrossDocked2020 training set into arms and stored the processed data in arm_info_2.pt, which can be downloaded here. We then docked the arms to their target proteins with Vina Minimize and used the resulting docked arm conformations as conditions for training.
python scripts/data_preparation/dock_training_arms.py
We follow the preprocessing procedure of DecompDiff. The processed dataset is provided here.
To train the model from scratch, download the *.lmdb, *_name2id.pt, and split_by_name.pt files and put them in the ./data directory. Then run the following command:
python scripts/train_diffusion_decompopt.py configs/training.yml
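Before launching training, it can help to confirm that the downloaded files are where the script expects them. The sketch below is an illustration and not part of the repository. It assumes the files sit directly in ./data, and the helper name `missing_patterns` is hypothetical.

```python
from pathlib import Path

# File name patterns taken from the training instructions above.
TRAINING_PATTERNS = ["*.lmdb", "*_name2id.pt", "split_by_name.pt"]


def missing_patterns(directory, patterns):
    """Return the glob patterns that match no file directly in `directory`."""
    root = Path(directory)
    return [pattern for pattern in patterns if not any(root.glob(pattern))]


if __name__ == "__main__":
    missing = missing_patterns("./data", TRAINING_PATTERNS)
    if missing:
        print("No files found in ./data for:", ", ".join(missing))
    else:
        print("All expected training files were found in ./data.")
```

The same helper can be pointed at another directory with a different pattern list.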
To sample molecules given protein pockets in the test set, download the test_index.pkl and *_eval.tar.gz files, extract the archive, and put the files in the ./data directory. To sample molecules with beta priors, also download beta_priors.zip and natom_models.pkl and put them in the ./pregen_info directory. Then run the following command:
bash scripts/run/sample_compose.sh ${data_id} ${outdir}
This script samples with the opt prior by default. The trained model checkpoints are provided here. You need to download both decompdiff.pt and decompopt.pt.
After sampling, evaluate the generated molecules with Vina Dock and select the best results:
bash scripts/run/eval_vina_full.sh ${data_id} ${outdir}
python scripts/select_best_arm.py ${outdir}
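The sampling, evaluation, and selection steps above can be chained for a single pocket. The following is a minimal sketch and not part of the repository. The function names `pipeline_commands` and `run_pipeline` are hypothetical, and the code assumes it is run from the repository root with the environment activated.

```python
import subprocess


def pipeline_commands(data_id, outdir):
    """Build the sampling, docking, and selection commands for one pocket."""
    data_id, outdir = str(data_id), str(outdir)
    return [
        ["bash", "scripts/run/sample_compose.sh", data_id, outdir],
        ["bash", "scripts/run/eval_vina_full.sh", data_id, outdir],
        ["python", "scripts/select_best_arm.py", outdir],
    ]


def run_pipeline(data_id, outdir):
    """Run the three steps in order, stopping at the first failing command."""
    for command in pipeline_commands(data_id, outdir):
        subprocess.run(command, check=True)
```

A call such as `run_pipeline(0, "outputs/pocket_0")` would run all three steps. Both argument values here are placeholders, not values taken from the repository.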
@inproceedings{
zhou2024decompopt,
title={DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization},
author={Xiangxin Zhou and Xiwei Cheng and Yuwei Yang and Yu Bao and Liang Wang and Quanquan Gu},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=Y3BbxvAQS9}
}