Commit
Merge pull request #4 from SeonghwanSeo/develop
Develop Performance & OpenPharmaco Release
Showing 1,025 changed files with 1,864 additions and 460,302 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,8 @@ | |
weights | ||
run.sh | ||
result/ | ||
examples/library/ | ||
pyproject.toml | ||
|
||
|
||
# Byte-compiled / optimized / DLL files | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,58 +1,60 @@ | ||
# PharmacoNet: Open-source Protein-based Pharmacophore Modeling | ||
|
||
**Before using PharmacoNet, also consider using PharmacoGUI - GUI powered by PharmacoNet.** | ||
**Before using PharmacoNet, consider using OpenPharmaco - GUI powered by PharmacoNet.** | ||
|
||
**[PharmacoGUI Github](https://github.com/SeonghwanSeo/PharmacoGUI) (Released in May-June)** | ||
**[OpenPharmaco Github](https://github.com/SeonghwanSeo/OpenPharmaco)** | ||
|
||
Accepted in ***NeurIPS Workshop 2023 (AI4D3 | New Frontiers of AI for Drug Discovery and Development)*** [[arxiv](https://arxiv.org/abs/2310.00681)] | ||
Accepted in **_NeurIPS Workshop 2023 (AI4D3 | New Frontiers of AI for Drug Discovery and Development)_** [[arxiv](https://arxiv.org/abs/2310.00681)] | ||
|
||
Official Github for ***PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling*** by Seonghwan Seo* and Woo Youn Kim. | ||
Official Github for **_PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling_** by Seonghwan Seo\* and Woo Youn Kim. | ||
|
||
1. Fully automated protein-based pharmacophore modeling based on image instance segmentation modeling | ||
2. Coarse-grained graph matching at the pharmacophore level for high throughput | ||
3. Pharmacophore-aware scoring function with parameterized analytical function for robust generalization ability | ||
|
||
PharmacoNet is an extremely rapid yet reasonably accurate ligand evaluation tool with high generalization ability. | ||
|
||
If you have any problems or need help with the code, please add an issue or contact [[email protected]](mailto:[email protected]). | ||
If you have any problems or need help with the code, please open a GitHub issue or contact [[email protected]](mailto:[email protected]). | ||
|
||
![](images/overview.png) | ||
|
||
|
||
|
||
## Quick Start | ||
|
||
```bash | ||
# Pharmacophore Modeling | ||
python modeling.py --pdb <PDB ID> --cuda # RCSB PDB importing, CUDA Acceleration | ||
python modeling.py --protein <PROTEIN_PATH> | ||
python modeling.py --protein <PROTEIN_PATH> --ref_ligand <REF_LIGAND_PATH> | ||
python modeling.py --pdb <PDB ID> # RCSB PDB importing | ||
python modeling.py --protein <PROTEIN_PATH> --prefix <EXP_NAME> --cuda # CUDA acceleration | ||
python modeling.py --protein <PROTEIN_PATH> --prefix <EXP_NAME> --ref_ligand <REF_LIGAND_PATH> | ||
|
||
# Virtual Screening | ||
python screening.py -p <MODEL_PATH> --library <LIBRARY_DIR> --out <RESULT_PATH> --cpus <NCPU> | ||
|
||
# Feature Extraction for Deep Learning Developer | ||
python feature_extraction.py | ||
python modeling.py --protein <PROTEIN_PATH> --cuda # CUDA Acceleration | ||
# Feature Extraction for Deep Learning Researcher | ||
python feature_extraction.py --protein <PROTEIN_PATH> --ref_ligand <REF_LIGAND_PATH> --out <SAVE_PKL_PATH> | ||
python feature_extraction.py --protein <PROTEIN_PATH> --center <X> <Y> <Z> --out <SAVE_PKL_PATH> --cuda | ||
``` | ||
|
||
#### Installation with `environment.yml` | ||
|
||
To support various environments, including Linux, macOS, and Windows, the script installs the **CPU-only version of PyTorch** by default. You can install a CUDA-enabled version by modifying `environment.yml` or by installing PyTorch manually. | ||
|
||
```bash | ||
conda env create -f environment.yml | ||
conda activate pmnet | ||
``` | ||
|
||
## Environment | ||
#### Manual Installation | ||
|
||
```shell | ||
# Requires python>=3.9; performance is best on newer versions (3.9, 3.10, 3.11, 3.12 - best) | ||
conda create --name pmnet python=3.10 | ||
conda create --name pmnet python=3.10 openbabel=3.1.1 pymol-open-source=3.0.0 numpy=1.26 | ||
conda activate pmnet | ||
conda install openbabel pymol-open-source | ||
|
||
pip install torch torchvision # torch >= 1.13, CUDA acceleration is available. 1min for 1 cpu, 10s for 1 gpu | ||
pip install rdkit biopython omegaconf timm numba # Numba is optional, but recommended. | ||
pip install molvoxel # https://github.com/SeonghwanSeo/molvoxel.git | ||
pip install torch # torch >= 1.13, CUDA acceleration is available. 1min for 1 cpu, 10s for 1 gpu | ||
pip install rdkit biopython omegaconf numba # Numba is optional, but recommended. | ||
pip install molvoxel # Molecular voxelization tools with minimal dependencies (https://github.com/SeonghwanSeo/molvoxel.git) | ||
``` | ||
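Because the default installation is CPU-only, it can be useful to check which device is actually usable before enabling CUDA acceleration. The following is a minimal sketch, not part of PharmacoNet: `pick_device` is a hypothetical helper name, and it only relies on the standard `torch.cuda.is_available()` call. Whether the returned string can be passed directly to PharmacoNet is an assumption.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if a CUDA-enabled PyTorch is usable, otherwise "cpu".

    Hypothetical helper for illustration; not provided by PharmacoNet.
    """
    # If PyTorch is not installed at all, fall back to "cpu".
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # True only when PyTorch was built with CUDA and a GPU is visible.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build is most likely the CPU-only one.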
|
||
|
||
|
||
## Pharmacophore Modeling | ||
|
||
You can run `modeling.py` for automated protein-based pharmacophore modeling with an RCSB PDB code (`--pdb`) or a custom protein path (`--protein`). When you use a protein path, you should also enter `--prefix`. | ||
|
@@ -86,7 +88,7 @@ Ligand 3 | |
- Synonyms: 6-FLUORO-7-(2-FLUORO-6-HYDROXYPHENYL)-4-[(2S)-2-METHYL-4-PROPANOYLPIPERAZIN-1-YL]-1-[4-METHYL-2-(PROPAN-2-YL)PYRIDIN-3-YL]PYRIDO[2,3-D]PYRIMIDIN-2(1H)-ONE | ||
|
||
INFO:root:Select the ligand number(s) (ex. 3 ; 1,3 ; manual ; all ; exit) | ||
ligand number:3 # USER INPUT: Enter the ligand number for binding site detection | ||
ligand number:3 # USER INPUT: Enter the ligand number for binding site detection | ||
INFO:root:Running 3th Ligand... | ||
Ligand 3 | ||
- ID : MOV (Chain: D [auth A]) | ||
|
@@ -114,16 +116,14 @@ INFO:root:Load PharmacoNet finish | |
INFO:root:Load examples/6OIM_protein.pdb | ||
WARNING:root:No ligand is detected! | ||
INFO:root:Enter the center of binding site manually: | ||
x: 2 # USER INPUT: Enter x | ||
y: -8 # USER INPUT: Enter y | ||
z: -1 # USER INPUT: Enter z | ||
x: 2 # USER INPUT: Enter x | ||
y: -8 # USER INPUT: Enter y | ||
z: -1 # USER INPUT: Enter z | ||
INFO:root:Using center (2.0, -8.0, -1.0) | ||
INFO:root:Save Pharmacophore Model to result/6OIM/6OIM_2.0_-8.0_-1.0_model.pm | ||
INFO:root:Save Pymol Visualization Session to result/6OIM/6OIM_2.0_-8.0_-1.0_model.pse | ||
``` | ||
|
||
|
||
|
||
## Virtual Screening | ||
|
||
We provide a simple script for screening. | ||
|
@@ -139,11 +139,9 @@ python screening.py -p <MODEL_PATH> --library <LIBRARY_DIR> --out <RESULT_PATH> | |
|
||
# Example | ||
python screening.py -p ./result/6oim/6oim_D_MOV_model.pm --library examples/library --out result.csv --cpus 1 | ||
python screening.py -p ./result/6oim/6oim_D_MOV_model.pm --library examples/library --out result.csv --cpus 1 --hbd 5 --hba 5 --aromatic 8 | ||
python screening.py -p ./result/6oim/6oim_D_MOV_model.pm --library examples/library --out result.csv --cpus 2 --hbd 5 --hba 5 --aromatic 8 | ||
``` | ||
|
||
|
||
|
||
#### Example python code for ligand evaluation | ||
|
||
Also, it can be easily included in your custom script via the python code below. (\* Multiprocessing is allowed) | ||
|
@@ -153,14 +151,12 @@ from pmnet import PharmacophoreModel | |
model = PharmacophoreModel.load(<PHARMACOPHORE_MODEL_PATH>) | ||
|
||
# NOTE: Scoring with ligand file with 1 or more conformers | ||
score = model.scoring_file(<LIGAND_PATH>) # SDF, MOL2, PDB | ||
score = model.scoring_file(<LIGAND_PATH>) # SDF, MOL2, PDB | ||
|
||
# NOTE: Scoring with RDKit ETKDG Conformers | ||
score = model.scoring_smiles(<SMILES>, <NUM_CONFORMERS>) | ||
``` | ||
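The scoring calls above can be wrapped to rank several ligand files at once. This is a minimal sketch under stated assumptions: `rank_ligands` is a hypothetical helper (not part of `pmnet`), it assumes a higher score means a better match, and it takes the scoring function as an argument so it runs without `pmnet` installed. The ligand file names are placeholders.

```python
from typing import Callable, Iterable


def rank_ligands(
    score_fn: Callable[[str], float],
    ligand_paths: Iterable[str],
) -> list[tuple[str, float]]:
    """Score each ligand file and return (path, score) pairs, highest score first.

    Hypothetical helper; assumes that a higher score indicates a better match.
    """
    scored = [(path, score_fn(path)) for path in ligand_paths]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Intended use (assumes pmnet is installed and the model file exists):
#   from pmnet import PharmacophoreModel
#   model = PharmacophoreModel.load("./result/6oim/6oim_D_MOV_model.pm")
#   ranking = rank_ligands(model.scoring_file, ["a.sdf", "b.sdf"])

if __name__ == "__main__":
    # Stand-in scores so the sketch runs without pmnet; the values are made up.
    fake_scores = {"a.sdf": 1.0, "b.sdf": 3.5, "c.sdf": 2.0}
    for path, score in rank_ligands(fake_scores.__getitem__, fake_scores):
        print(path, score)
```

Passing the scorer as a callable keeps the helper independent of the model object, so the same function also works with a wrapper around `model.scoring_smiles`.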
|
||
|
||
|
||
## Pharmacophore Feature Extraction | ||
|
||
For deep learning researchers who want to use PharmacoNet as a pre-trained model for feature extraction, we provide the script `feature_extraction.py`. | ||
|
@@ -170,11 +166,49 @@ python feature_extraction.py --protein <PROTEIN_PATH> --ref_ligand <REF_LIGAND_P | |
python feature_extraction.py --protein <PROTEIN_PATH> --center <X> <Y> <Z> --out <SAVE_PKL_PATH> | ||
``` | ||
|
||
#### Paper List | ||
```bash | ||
PHARMACOPHORE NODE FEATURE LIST: List[Dict[str, Any]] | ||
PHARMACOPHORE NODE FEATURE: Dict[str, Any] | ||
- feature: NDArray[np.float32] | ||
- type: str (7 types) | ||
{'Hydrophobic', 'Aromatic', 'Cation', 'Anion', | ||
'Halogen', 'HBond_donor', 'HBond_acceptor'} | ||
*** `type` is obtained from `nci_type`. | ||
- nci_type: str (10 types) | ||
'Hydrophobic': Hydrophobic interaction | ||
'PiStacking_P': Pi-Pi Stacking (Parallel) | ||
'PiStacking_T': Pi-Pi Stacking (T-shaped) | ||
'PiCation_lring': Cation-Pi Interaction btw Protein Cation & Ligand Aromatic Ring | ||
'PiCation_pring': Cation-Pi Interaction btw Protein Aromatic Ring & Ligand Cation | ||
'SaltBridge_pneg': SaltBridge btw Protein Anion & Ligand Cation | ||
'SaltBridge_lneg': SaltBridge btw Protein Cation & Ligand Anion | ||
'HBond_pdon': Hydrogen Bond btw Protein Donor & Ligand Acceptor | ||
'HBond_ldon': Hydrogen Bond btw Protein Acceptor & Ligand Donor | ||
'XBond': Halogen Bond | ||
- priority_score: float in [0, 1] | ||
- hotspot_position: tuple[float, float, float] - (x, y, z) | ||
- center: tuple[float, float, float] - (x, y, z) | ||
- radius: float | ||
``` | ||
- TacoGFN [[paper](https://arxiv.org/abs/2310.03223)] | ||
### Python Script | ||
For feature extraction, it is recommended to use `score_threshold=0.5` instead of the default setting used for pharmacophore modeling. To extract more features, decrease `score_threshold`. | ||
```python | ||
from pmnet.module import PharmacoNet | ||
|
||
module = PharmacoNet( | ||
"cuda", | ||
score_threshold=0.5,  # float | dict[str, float]; 0.5 is recommended for feature extraction | ||
) | ||
|
||
pharmacophore_node_feature_list = module.feature_extraction(<PROTEIN_PATH>, center=(<X>, <Y>, <Z>)) | ||
``` | ||
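The extracted list can be post-processed with plain Python, since each node is a dictionary with the keys listed above. The following is a minimal sketch: `select_features` is a hypothetical helper, and the toy records only mimic the documented keys (the `feature` array is omitted and all values are made up).

```python
from typing import Any


def select_features(
    features: list[dict[str, Any]],
    types: set[str],
    min_priority: float = 0.0,
) -> list[dict[str, Any]]:
    """Keep nodes whose `type` is in `types` and whose `priority_score` is at least `min_priority`.

    Hypothetical helper; relies only on the documented `type` and `priority_score` keys.
    """
    return [
        node
        for node in features
        if node["type"] in types and node["priority_score"] >= min_priority
    ]


if __name__ == "__main__":
    # Toy records following the documented schema; values are illustrative only.
    toy = [
        {"type": "Aromatic", "nci_type": "PiStacking_P", "priority_score": 0.9,
         "center": (1.0, 2.0, 3.0), "radius": 1.0},
        {"type": "Halogen", "nci_type": "XBond", "priority_score": 0.3,
         "center": (0.0, 0.0, 0.0), "radius": 1.0},
        {"type": "Hydrophobic", "nci_type": "Hydrophobic", "priority_score": 0.7,
         "center": (4.0, 5.0, 6.0), "radius": 1.0},
    ]
    selected = select_features(toy, {"Aromatic", "Halogen"}, min_priority=0.5)
    print([node["type"] for node in selected])
```

In practice, `pharmacophore_node_feature_list` from the script above would take the place of `toy`.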
### Paper List | ||
- TacoGFN [[paper](https://arxiv.org/abs/2310.03223)] | ||
## Citation | ||
|
@@ -189,4 +223,3 @@ Paper on [arxiv](https://arxiv.org/abs/2310.00681) | |
url = {https://arxiv.org/abs/2310.00681}, | ||
} | ||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
name: openph | ||
channels: | ||
- conda-forge | ||
dependencies: | ||
- python=3.11 | ||
- pip=24.0 | ||
- openbabel=3.1.1 | ||
- pymol-open-source=3.0.0 | ||
- numpy=1.26 | ||
- pip: | ||
- torch==1.13.1 | ||
- molvoxel==0.1.3 | ||
- numba==0.59 | ||
- omegaconf==2.3.0 | ||
- gdown==5.1.0 | ||
- biopython==1.83 |
Binary file not shown.