albertma1986/DeepDelta

 
 


DeepDelta

Overview

DeepDelta is a pairwise deep learning approach that processes two molecules simultaneously and learns to predict property differences between two molecules.

Figure 1: Traditional and Pairwise Architectures. (A) Traditional molecular machine learning models take single molecules as input and predict their absolute properties. Predicted property differences can then be calculated by subtracting the predicted values for two molecules. (B) Pairwise models train on property differences between pairs of molecules to directly predict the property changes of molecular derivatizations. (C) Molecules are cross-merged into pairs only after the cross-validation split to prevent data leakage during model evaluation. As a result, every molecule in the dataset occurs in pairs in either the training data or the testing data, but never both.
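The pairing scheme in panel (C) can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the helper `cross_merge`, the toy SMILES strings and values, and the sign convention (second molecule minus first) are all assumptions made for the example, as is the choice to keep self-pairs and both orderings.

```python
from itertools import product


def cross_merge(molecules):
    """Build every ordered pair (x, y) within ONE split.

    The target is value_y - value_x (sign convention assumed for this sketch).
    Self-pairs and both orderings are kept (also an assumption).
    """
    return [
        (smiles_x, smiles_y, value_y - value_x)
        for (smiles_x, value_x), (smiles_y, value_y) in product(molecules, repeat=2)
    ]


# Hypothetical toy data: (SMILES, measured property value)
data = [("CCO", 1.0), ("CCN", 2.5), ("CCC", 0.5), ("CCCl", 3.0), ("CCBr", 2.0)]

# Split the molecules first ...
train, test = data[:3], data[3:]

# ... and only then build pairs, separately inside each split.
train_pairs = cross_merge(train)
test_pairs = cross_merge(test)

# No molecule appears on both sides of the split.
train_molecules = {s for pair in train_pairs for s in pair[:2]}
test_molecules = {s for pair in test_pairs for s in pair[:2]}
```

Pairing before splitting would let the same molecule appear in both training and testing pairs, which is the leakage the figure warns about.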

On 10 pharmacokinetic benchmark tasks, our DeepDelta approach outperforms two established molecular machine learning algorithms, the message passing neural network (MPNN) ChemProp and Random Forest using radial fingerprints.

We also derive three simple computational tests of our models from mathematical first principles and show that compliance with these tests correlates with overall model performance, providing an innovative, unsupervised, and easily computable measure of expected model performance and applicability.

1. With the same molecule for both inputs, the prediction should be zero:

$$DeepDelta(x,x)= 0$$

2. With swapped input molecules, the prediction should change sign:

$$DeepDelta(x,y)= - DeepDelta(y,x)$$

3. Predicted differences among three molecules should be additive:

$$DeepDelta(x,y) + DeepDelta(y,z)= DeepDelta(x,z)$$
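Because none of these three properties needs measured labels, they can be checked on any set of molecules. The sketch below is one possible way to score a pairwise predictor against them. It assumes a hypothetical `predict(x, y)` callable that returns a single number; the function name `property_violations`, the toy values, and the use of mean absolute deviation as the score are illustrative choices, not taken from the repository.

```python
from itertools import product
from statistics import mean


def property_violations(predict, molecules):
    """Mean absolute deviation from each of the three properties (0 is ideal)."""
    same = mean(abs(predict(x, x)) for x in molecules)
    swapped = mean(
        abs(predict(x, y) + predict(y, x))
        for x, y in product(molecules, repeat=2)
    )
    additive = mean(
        abs(predict(x, y) + predict(y, z) - predict(x, z))
        for x, y, z in product(molecules, repeat=3)
    )
    return {"same_molecule": same, "swapped": swapped, "additivity": additive}


# Hypothetical stand-ins for a trained model.
values = {"A": 1.0, "B": 2.5, "C": 0.5}


def ideal(x, y):
    # An exact difference satisfies all three properties.
    return values[y] - values[x]


def biased(x, y):
    # A constant offset violates all three properties.
    return values[y] - values[x] + 0.1


ideal_scores = property_violations(ideal, list(values))
biased_scores = property_violations(biased, list(values))
```

In this toy setup the exact-difference predictor scores zero on every test, while the constant offset shows up in all three scores.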

For more information, please refer to: https://chemrxiv.org/engage/chemrxiv/article-details/642d823f0784a63aee949898

If you use this data or code, please cite: Fralish Z, Chen A, Skaluba P, Reker D. DeepDelta: Predicting Pharmacokinetic Improvements of Molecular Derivatives with Deep Learning. ChemRxiv. Cambridge: Cambridge Open Engage; 2023.


Requirements

Comparison Models

Given the larger size of delta datasets, we recommend using a GPU for significantly faster training.

To use ChemProp with GPUs, you will need:

  • cuda >= 8.0
  • cuDNN

Descriptions of Folders

Code

Python code for evaluating DeepDelta and traditional models based on their ability to predict property differences between two molecules.

Datasets

Curated data for 10 ADMET property benchmarking training sets and 2 external test sets.

Results

Results from 5x10-fold cross-validation that are utilized in further analysis.
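As a rough illustration of the evaluation protocol, the sketch below reads "5x10-fold" as five repetitions of 10-fold cross-validation, each with a fresh shuffle. That reading is an assumption, as are the function name `repeated_kfold`, its parameters, and the seed. The repository's code may instead rely on a library splitter.

```python
import random


def repeated_kfold(n_items, n_splits=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) over molecule indices."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_items))
        rng.shuffle(indices)  # new shuffle for every repetition
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, test_idx


# 25 hypothetical molecules -> 5 repetitions x 10 folds = 50 train/test splits.
splits = list(repeated_kfold(n_items=25))
```

The splits are made over individual molecules. Pairs would be formed afterwards, inside each training and testing fold.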


License

The copyrights of the software are owned by Duke University. As such, two licenses for this software are offered:

  1. An open-source GPLv2 license for non-commercial academic use.
  2. A custom license with Duke University, for commercial use or uses without the GPLv2 license restrictions.
