[FEATURE]: Split based on functional groups #145

Open
kspieks opened this issue Jul 13, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

Collaborator

kspieks commented Jul 13, 2023

This feature would be specific to cheminformatics applications.

Scaffold splits are helpful, but they are based on ring structures. If I remember correctly, this means they do not account for the side chains or functional groups attached to the rings, and they are of no use for molecules without rings (these all return the empty scaffold '').
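
A minimal sketch of this limitation, assuming the scaffolds in question are Bemis-Murcko scaffolds as computed by RDKit's `MurckoScaffold` module (the issue does not say which implementation is used). The helper name `scaffold_of` and the example molecules are illustrative only:

```python
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_of(smiles: str) -> str:
    """Return the Bemis-Murcko scaffold SMILES for a molecule given as SMILES."""
    return MurckoScaffold.MurckoScaffoldSmiles(smiles=smiles)


# Toluene and phenol differ only in the substituent attached to the ring,
# so the scaffold cannot tell them apart.
toluene_scaffold = scaffold_of("Cc1ccccc1")
phenol_scaffold = scaffold_of("Oc1ccccc1")

# Ethanol and acetone have no rings, so both collapse to the empty scaffold.
ethanol_scaffold = scaffold_of("CCO")
acetone_scaffold = scaffold_of("CC(C)=O")

print(toluene_scaffold == phenol_scaffold)
print(repr(ethanol_scaffold), repr(acetone_scaffold))
```

A split keyed on these strings would put every acyclic molecule into a single group, regardless of its functional groups.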

The workaround is to create a vector representation, commonly a Morgan fingerprint, and then pass that vector to any of the supported samplers. Since generating this vector is not restricted to ring-containing species, the process is more generalizable. However, Morgan fingerprints still have some limitations. The vector is very sparse for the small-molecule chemistry our group often focuses on, which can cause some odd behavior when analyzing similarity metrics. Interestingly, other vector representations (MACCS, Avalon, AtomPair) have not performed as well as Morgan, so I've been stuck using Morgan for the time being.
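
To illustrate the sparsity concern, here is a rough sketch using RDKit's fingerprint generator. The radius of 2 and the 2048-bit length are assumptions chosen for illustration, not settings taken from this issue, and `morgan_bits` is a hypothetical helper name:

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Assumed settings: radius 2, 2048 bits (illustrative, not from the issue).
morgan_gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)


def morgan_bits(smiles: str):
    """Return the Morgan bit vector for a molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return morgan_gen.GetFingerprint(mol)


ethanol_fp = morgan_bits("CCO")
propanol_fp = morgan_bits("CCCO")

# Fraction of bits that are set; small molecules set only a handful.
density = ethanol_fp.GetNumOnBits() / ethanol_fp.GetNumBits()

# Tanimoto similarity is computed over those few set bits only.
similarity = DataStructs.TanimotoSimilarity(ethanol_fp, propanol_fp)

print(f"on bits: {ethanol_fp.GetNumOnBits()}, density: {density:.4f}")
print(f"Tanimoto(ethanol, propanol): {similarity:.3f}")
```

With so few bits set, a change of one or two bits can shift the similarity substantially, which may be one source of the odd behavior described above. The bit vector would still need converting to an array before being handed to a sampler.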

Here, I propose another method to generate the vectors that would explicitly include functional group information. These links would be a good starting point:
http://rdkit.org/docs/source/rdkit.Chem.Fragments.html
https://forum.knime.com/t/is-there-a-simple-way-to-count-functional-groups/5435/6
It would be interesting to explore this further. I may have time in a few weeks but feel free to get started without me.
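
As a possible starting point for the proposal above, the sketch below builds a functional-group count vector from the `fr_*` functions in `rdkit.Chem.Fragments` (the first link). This is one way the idea could look, not a settled design: the names `FRAGMENT_NAMES` and `functional_group_vector` are hypothetical, and using every available `fr_*` function is an assumption.

```python
from rdkit import Chem
from rdkit.Chem import Fragments

# Every fr_* counting function exposed by rdkit.Chem.Fragments,
# sorted so that the vector layout is reproducible.
FRAGMENT_NAMES = sorted(name for name in dir(Fragments) if name.startswith("fr_"))


def functional_group_vector(smiles: str) -> list:
    """Return one count per fr_* function for a molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return [getattr(Fragments, name)(mol) for name in FRAGMENT_NAMES]


# Both molecules are acyclic, yet they receive distinct, non-empty vectors.
ethanol_vec = functional_group_vector("CCO")
acetone_vec = functional_group_vector("CC(C)=O")

for name, n_eth, n_ace in zip(FRAGMENT_NAMES, ethanol_vec, acetone_vec):
    if n_eth or n_ace:
        print(f"{name}: ethanol={n_eth}, acetone={n_ace}")
```

The resulting vector could then be passed to a sampler in the same way as a Morgan fingerprint. Because each position corresponds to a named group, the vector is also easier to interpret than hashed fingerprint bits.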
