Possibility on speeding the calculations up? #19
Comments
There's no easy way to make it faster. It's highly parallelized, so if you have access to a machine with more CPU cores it will speed up, but that doesn't really count. We could do some profiling to find which descriptors are the slowest to calculate and then try and speed those up, if you are interested! |
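For readers who want to try that kind of profiling themselves, here is a minimal sketch of a per-descriptor timing harness. It is not mordred's own tooling: `profile_descriptors`, `cheap_descriptor` and `expensive_descriptor` are hypothetical names, and the "descriptors" are plain stand-in functions operating on SMILES strings so the example runs with the standard library alone. In practice you would swap in the real descriptor calls.

```python
import time
from collections import defaultdict


def profile_descriptors(descriptors, molecules, repeats=1):
    """Time each descriptor over all molecules; return (name, seconds) slowest first.

    `descriptors` maps a name to a callable taking one molecule (hypothetical API).
    """
    totals = defaultdict(float)
    for _ in range(repeats):
        for mol in molecules:
            for name, func in descriptors.items():
                start = time.perf_counter()
                try:
                    func(mol)
                except Exception:
                    pass  # a failing descriptor still costs time, so keep counting it
                totals[name] += time.perf_counter() - start
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Stand-in descriptors (assumptions for illustration only).
def cheap_descriptor(mol):
    return len(mol)


def expensive_descriptor(mol):
    # Deliberately wasteful loop to mimic a slow descriptor.
    return sum(i * i for i in range(50_000)) + len(mol)


ranking = profile_descriptors(
    {"cheap": cheap_descriptor, "expensive": expensive_descriptor},
    molecules=["CCO", "c1ccccc1", "CC(=O)O"],
    repeats=3,
)
for name, seconds in ranking:
    print(f"{name}: {seconds:.4f} s")
```

The slowest entries at the top of `ranking` are the ones worth optimizing first.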
Hi @JacksonBurns ! Thank you so much for the reply and suggestions. Would it be too troublesome for you to do profiling to find which descriptors are the slowest to calculate? |
Sure - I put together this small demo (mordred_profile.json) which you download, change the extension to
    @classmethod
    def from_query(cls, mol, require_3D, explicit_hydrogens, kekulizes, id, config):
        if not isinstance(mol, Chem.Mol):
            raise TypeError("{!r} is not rdkit.Chem.Mol instance".format(mol))

        n_frags = len(Chem.GetMolFrags(mol))

        if mol.HasProp("_Name"):
            name = mol.GetProp("_Name")
        else:
            name = Chem.MolToSmiles(Chem.RemoveHs(mol, updateExplicitCount=True))

        mols, coords = {}, {}

        for eh, ke in ((eh, ke) for eh in explicit_hydrogens for ke in kekulizes):
            m = Chem.AddHs(mol) if eh else Chem.RemoveHs(mol, updateExplicitCount=True)

            if ke:
                Chem.Kekulize(m)

            if require_3D:
                try:
                    conf = m.GetConformer(id)
                    if conf.Is3D():
                        coords[eh, ke] = conformer_to_numpy(conf)
                except ValueError:
                    pass

            m.RemoveAllConformers()
            mols[eh, ke] = m

        return cls(mols, coords, n_frags, name, config)
inside |
Hi @JacksonBurns, thank you for the demo! I will have a look at this and hopefully come back with good news! I tried using more CPU cores, and it sped things up quite well (roughly 10x as a rough estimate), but you're right, it doesn't really count. |
Hi all, I have a question to ask, is there a possibility that we can actually speed the calculations up? It's awesome that mordred is still maintained!
Why speed calculations up?
I have about 90,000 molecules to calculate chemical descriptors for, and it takes between 4 and 5 hours.
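To put those numbers in per-molecule terms, here is a small back-of-the-envelope calculation. It uses only the figures quoted above (about 90,000 molecules, 4 to 5 hours); the helper name `seconds_per_molecule` is made up for this sketch.

```python
def seconds_per_molecule(n_molecules, hours):
    """Average wall-clock seconds spent per molecule."""
    return hours * 3600 / n_molecules


n_molecules = 90_000  # approximate count from the post above
for hours in (4, 5):
    per_mol = seconds_per_molecule(n_molecules, hours)
    print(f"{hours} h -> {per_mol:.2f} s per molecule, {1 / per_mol:.2f} molecules/s")
```

So the run averages roughly 0.16 to 0.20 seconds per molecule, which is the baseline any speed-up would be measured against.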
The code