This feature would be specific to cheminformatics applications.
Scaffold splits are helpful, but they are based on ring structures. If I remember correctly, this means they do not account for the side chains or functional groups attached to the rings, and they are of no use for molecules that do not have rings (these all return an empty scaffold of '').
The workaround is to create a vector representation, commonly a Morgan fingerprint, and then pass that vector to any of the supported samplers. Since generating this vector is not restricted to ring-containing species, the process is more generalizable. However, Morgan fingerprints still have some limitations. The vector is very sparse for the small-molecule chemistry our group often focuses on, which can cause some odd behavior when analyzing similarity metrics. Interestingly, other vector representations (MACCS, Avalon, AtomPair) have not performed as well as Morgan, so I've been stuck using Morgan for the time being.
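For reference, here is a minimal sketch of the two behaviors described above, assuming RDKit is installed. The example molecules, the radius of 2, and the 2048-bit fingerprint size are my own illustrative choices, not settings from our actual workflow.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator
from rdkit.Chem.Scaffolds import MurckoScaffold

# Illustrative molecules (assumed for this sketch): ethanol, acetic acid, phenol
smiles = ["CCO", "CC(=O)O", "c1ccccc1O"]

# Ring-free molecules all collapse to the same empty scaffold string
scaffolds = [MurckoScaffold.MurckoScaffoldSmiles(smiles=s) for s in smiles]

# Morgan fingerprints can be generated for any valid molecule, rings or not
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
X = np.array(
    [generator.GetFingerprintAsNumPy(Chem.MolFromSmiles(s)) for s in smiles]
)

# Fraction of bits set per molecule; small molecules set only a handful
density = X.sum(axis=1) / X.shape[1]

for s, scaffold, d in zip(smiles, scaffolds, density):
    print(f"{s!r}: scaffold={scaffold!r}, fraction of bits set={d:.4f}")
```

The rows of `X` are what would be handed to a sampler in place of the scaffold. The low fraction of set bits for molecules this small is the sparsity I am referring to.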
Here, I propose another method to generate the vectors that would explicitly include functional group information. These links would be a good starting point:
http://rdkit.org/docs/source/rdkit.Chem.Fragments.html
https://forum.knime.com/t/is-there-a-simple-way-to-count-functional-groups/5435/6
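As a rough starting point, the sketch below builds a functional-group count vector from the `fr_*` functions in `rdkit.Chem.Fragments` (the first link). It assumes RDKit is installed; the helper name `functional_group_vector` and the example molecules are hypothetical and only for illustration. This is one possible way to do it, not a settled design.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import Fragments

# Collect the names of every fragment-counting function RDKit exposes (fr_*)
FRAGMENT_NAMES = sorted(n for n in dir(Fragments) if n.startswith("fr_"))


def functional_group_vector(smiles):
    """Hypothetical helper: count each RDKit fragment pattern in a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # Each fr_* function returns the number of matches of its pattern
    return np.array(
        [getattr(Fragments, name)(mol) for name in FRAGMENT_NAMES], dtype=int
    )


# Illustrative ring-free molecules: ethanol, acetic acid, ethylamine
smiles = ["CCO", "CC(=O)O", "CCN"]
X = np.array([functional_group_vector(s) for s in smiles])

# Show which functional groups were found in each molecule
for s, row in zip(smiles, X):
    found = {n: int(c) for n, c in zip(FRAGMENT_NAMES, row) if c > 0}
    print(s, found)
```

Unlike the scaffold, these vectors differ between ring-free molecules, and the resulting array could be passed to the samplers the same way a Morgan fingerprint matrix is. Whether counts or simple presence/absence flags work better for splitting is something I have not tested.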
It would be interesting to explore this further. I may have time in a few weeks but feel free to get started without me.