Decoder appears to decode random initial fingerprints to the same SMILES #22

Open
andreirekesh opened this issue Aug 2, 2024 · 0 comments

I've recently been interested in running SynNet with the most recent version of the US Stock Enamine building blocks (BBs). I ran steps 0-2 to preprocess the data and wanted to try reward-guided molecule generation with the genetic algorithm (GA), following the instructions in the README. However, even with the initial randomly generated fingerprints, 70-80 of the initial 100 are decoded to the same SMILES string:

CC(C)(C)OC(=O)N1CC2NCCN(S(=O)(=O)CC(=O)c3ccccc3)C2C1
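
For illustration, a minimal sketch of how this collapse could be quantified, assuming the decoded SMILES are available as a plain Python list. The function name `summarize_duplicates` and the toy input are hypothetical and are not part of SynNet:

```python
from collections import Counter

def summarize_duplicates(smiles_list):
    """Return (number of unique strings, most common string, its count)."""
    counts = Counter(smiles_list)
    most_common, n = counts.most_common(1)[0]
    return len(counts), most_common, n

# Toy data standing in for decoder output (assumption: not real results).
decoded = ["CCO"] * 3 + ["c1ccccc1", "CCN"]
n_unique, top, top_count = summarize_duplicates(decoded)
print(f"{n_unique} unique of {len(decoded)}; most common: {top} ({top_count}x)")
```

This compares raw strings, so it assumes the decoder emits SMILES in a consistent (for example canonical) form; otherwise the same molecule written two ways would be counted twice.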

This collapse onto a single SMILES causes the GA population update to hang forever: too few unique new molecules are found to add to the pool, so parent_idx never reaches num_population in a step of the algorithm.
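
The hang can be illustrated with a minimal sketch of a population-fill loop that gives up after a fixed number of attempts instead of waiting indefinitely for unique molecules. This is not SynNet's actual update code; `fill_population`, `max_attempts`, and the toy inputs are assumptions made for the example:

```python
import itertools

def fill_population(candidates, existing, num_population, max_attempts=1000):
    """Add unique candidates until num_population is reached or attempts run out.

    Returns the pool and a flag saying whether the target size was reached.
    """
    pool = list(existing)
    seen = set(existing)
    attempts = 0
    for smi in candidates:
        if len(pool) >= num_population or attempts >= max_attempts:
            break
        attempts += 1
        if smi not in seen:  # only unique molecules grow the pool
            seen.add(smi)
            pool.append(smi)
    return pool, len(pool) >= num_population

# A decoder that always returns the same SMILES is modelled as an endless
# stream of one string; without max_attempts this loop would never finish.
pool, filled = fill_population(itertools.repeat("CCO"), ["CCN"], 3, max_attempts=50)
print(pool, filled)
```

With a guard like this, a degenerate decoder surfaces as an unfilled pool that can be reported, rather than as a silent hang.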

Could this be the result of the difference in the Enamine stock between the time of publication and now? Any help is appreciated!

Thank you,
Andrei
