Improve molecule geometry prediction #82

Open
ehermes opened this issue Nov 19, 2018 · 1 comment


ehermes commented Nov 19, 2018

I am trying to convert RMG adjacency list molecule specifications into 3D structures, and I keep running into cases where the predicted structures are unreasonable and do not match chemical intuition. To illustrate the problem, consider the following script, which generates hydrogen peroxide:

from catkit.build import molecule
from ase.visualize import view

view(molecule('H2O2')[0])

This results in the following geometry, with the O-O-H angle highlighted to show that it is 180 degrees:
[Image: screenshot of the predicted H2O2 geometry]

From what I can tell, CatKit determines bond angles solely from the number of bonds each atom has, assuming the hybridization is whatever a neutral carbon atom would have with the same number of bonds. This breaks for other elements (like oxygen) and for species containing carbon radicals.
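
To make the suspected failure mode concrete, here is a minimal sketch of a bond-count-only heuristic. It is not CatKit's actual code; the table and the function name `angle_from_bond_count` are hypothetical and only mirror the behavior described above.

```python
# Ideal angles for carbon-like hybridization, keyed only by the number of bonds.
# Assumption: this table imitates the heuristic described in the issue,
# it is not taken from CatKit.
CARBON_LIKE_ANGLE = {4: 109.5, 3: 120.0, 2: 180.0}


def angle_from_bond_count(n_bonds):
    """Return the angle a neutral carbon with n_bonds bonds would have."""
    return CARBON_LIKE_ANGLE.get(n_bonds)


# Each oxygen in HOOH has two bonds, so it is treated like an sp carbon
# and the O-O-H angle comes out linear.
print(angle_from_bond_count(2))  # prints 180.0
```

Because the lookup never sees the element or its lone pairs, a two-coordinate oxygen and a two-coordinate sp carbon are indistinguishable.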

Ideally, CatKit would at least be able to generate proper structures for compounds containing oxygen atoms. Perhaps this could be done by connecting dummy "lone pair" atoms to elements like oxygen, then generating the 3D structure and deleting the lone pair sites. This could also be extended to allow for user-specified lone pairs for radical species, which would enable me to quickly convert RMG adjacency lists (which optionally specify lone pairs and lone unpaired electrons) into reasonable 3D structures.
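
A rough sketch of the lone-pair idea, assuming each lone pair occupies one site around the atom just like a bonded neighbor. The names `count_lone_pairs` and `ideal_angle` are made up for illustration, the element table is deliberately tiny, and unpaired electrons are assumed not to take up a site (that last choice is a simplification, not something the issue settles).

```python
# Valence electron counts for a few main-group elements (illustrative subset).
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6}

# Idealized angles keyed by the number of occupied sites (neighbors + lone pairs).
IDEAL_ANGLE = {2: 180.0, 3: 120.0, 4: 109.5}


def count_lone_pairs(symbol, bond_order_sum, unpaired=0, charge=0):
    """Infer lone pairs from the electrons left after bonding."""
    nonbonding = VALENCE_ELECTRONS[symbol] - charge - bond_order_sum - unpaired
    if nonbonding < 0 or nonbonding % 2:
        raise ValueError("inconsistent electron count for " + symbol)
    return nonbonding // 2


def ideal_angle(symbol, n_neighbors, bond_order_sum=None,
                lone_pairs=None, unpaired=0, charge=0):
    """Idealized angle, counting dummy lone-pair sites next to real neighbors."""
    if bond_order_sum is None:
        bond_order_sum = n_neighbors  # assume all single bonds
    if lone_pairs is None:
        # Not user-specified, so infer it from the electron count.
        lone_pairs = count_lone_pairs(symbol, bond_order_sum, unpaired, charge)
    sites = n_neighbors + lone_pairs
    return IDEAL_ANGLE.get(sites)  # None when no angle is defined


print(ideal_angle("O", 2))                    # oxygen in HOOH: 109.5
print(ideal_angle("C", 2, bond_order_sum=4))  # carbon in CO2: 180.0
print(ideal_angle("C", 3, unpaired=1))        # methyl radical carbon: 120.0
```

The `lone_pairs` and `unpaired` arguments are where values read from an adjacency list could be passed in directly. The angles are idealized starting points only; the real O-O-H angle in hydrogen peroxide is smaller than 109.5 degrees, so a relaxation step would still be needed afterwards.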


jboes commented Nov 21, 2018

Hi ehermes, the lone-pair dummy is a good idea. I'll keep it in mind and implement it when I can, though I'm not sure when I'll have time to get around to it.

In the meantime, I highly recommend RDKit for this purpose. Their system is much more sophisticated for molecule enumeration (these things get quite complex). If you're still interested in integration with catkit, check my separate utility module: https://github.com/jboes/catkit-utils/blob/master/ckutil/rdk.py. It contains a wrapper function called get_uff_coordinates() which will take that poor initial guess and return something more appropriate.
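
For reference, a minimal sketch of the plain RDKit route, independent of the `get_uff_coordinates()` wrapper (whose signature is not shown here). It assumes RDKit is installed and that hydrogen peroxide is given as the SMILES string `OO`; the helper names `angle_deg` and `h2o2_ooh_angle` are made up for this example.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees for three (x, y, z) points."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def h2o2_ooh_angle(seed=42):
    """Embed HOOH with RDKit, relax it with UFF, and return one O-O-H angle."""
    # Imported lazily so angle_deg stays usable without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.AddHs(Chem.MolFromSmiles("OO"))
    if AllChem.EmbedMolecule(mol, randomSeed=seed) != 0:
        raise RuntimeError("3D embedding failed")
    AllChem.UFFOptimizeMolecule(mol)

    conf = mol.GetConformer()

    def pos(i):
        p = conf.GetAtomPosition(i)
        return (p.x, p.y, p.z)

    # Atoms 0 and 1 are the two oxygens; find a hydrogen bonded to atom 0.
    h = next(n.GetIdx() for n in mol.GetAtomWithIdx(0).GetNeighbors()
             if n.GetSymbol() == "H")
    return angle_deg(pos(1), pos(0), pos(h))
```

Calling `h2o2_ooh_angle()` should give a bent O-O-H angle rather than the linear one in the screenshot, since the force field treats oxygen with its own parameters instead of carbon's.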
