This repository has been archived by the owner on Jul 10, 2024. It is now read-only.

Round-trip Disagreement #3

Open
tylerperyea opened this issue Dec 20, 2013 · 2 comments

@tylerperyea
Contributor

Some structures (usually with several fused rings that contain stereo annotations) don't return the same hash after a round-trip. I'm not sure why this happens.

Example:

[H][C@@]12[C@@H]3SC[C@]4(NCCC5=C4C=C(OC)C(O)=C5)C(=O)OC[C@H](N1[C@@H](O)[C@@H]6CC7=C([C@H]2N6C)C(O)=C(OC)C(C)=C7)C8=C9OCOC9=C(C)C(OC(C)=O)=C38

Yields:

DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ42LBF8VB

But, if the output file is fed through again, I get:

DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ3C1UCNTD

Each further pass reproduces this second hash. The discrepancy may be due to parity conflict resolution, which seems to be done arbitrarily. Where there is ambiguity or a conflict, it would probably be better to err on the side of no annotation. However, I think this example contains enough information to be resolved consistently.
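
For reference, the two hashes above differ only in their last dash-separated block. The sketch below is just a plain string comparison that makes this easy to check for other structures; `differing_blocks` is a helper name invented here, not part of the tool, and it assumes nothing about what each block encodes.

```python
def differing_blocks(hash_a: str, hash_b: str) -> list[int]:
    """Return the 0-based indices of the dash-separated blocks that differ."""
    blocks_a = hash_a.split("-")
    blocks_b = hash_b.split("-")
    if len(blocks_a) != len(blocks_b):
        raise ValueError("hashes have different numbers of blocks")
    return [i for i, (a, b) in enumerate(zip(blocks_a, blocks_b)) if a != b]


# Hashes copied from this report: first pass vs. second pass.
first_pass = "DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ42LBF8VB"
second_pass = "DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ3C1UCNTD"

print(differing_blocks(first_pass, second_pass))  # [3]
```

Only the final block changes between passes, so the first three blocks are stable for this structure.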

Similarly, this happens with the following (theoretically equivalent) molfile:


  Symyx   02191314562D 1   1.00000     0.00000     0

 55 63  0     1  0            999 V2000
    5.2579  -10.3796    0.0000 N   0  0  3  0  0  0           0  0  0
    5.2643  -11.3463    0.0000 C   0  0  2  0  0  0           0  0  0
    6.9059  -11.3355    0.0000 C   0  0  0  0  0  0           0  0  0
    6.8995  -10.3688    0.0000 C   0  0  0  0  0  0           0  0  0
    3.7293  -11.1063    0.0000 N   0  0  3  0  0  0           0  0  0
    4.4671  -12.0064    0.0000 C   0  0  2  0  0  0           0  0  0
    6.3356   -7.6267    0.0000 C   0  0  1  0  0  0           0  0  0
    3.2045  -12.4015    0.0000 C   0  0  0  0  0  0           0  0  0
    4.4093   -9.9560    0.0000 C   0  0  2  0  0  0           0  0  0
    6.0714  -11.7993    0.0000 C   0  0  1  0  0  0           0  0  0
    3.4158  -10.1363    0.0000 C   0  0  2  0  0  0           0  0  0
    6.0592   -9.9452    0.0000 C   0  0  2  0  0  0           0  0  0
    7.7339  -11.7884    0.0000 C   0  0  0  0  0  0           0  0  0
    7.7217   -9.9342    0.0000 C   0  0  0  0  0  0           0  0  0
    3.2108  -13.3557    0.0000 C   0  0  0  0  0  0           0  0  0
    7.0154   -7.0763    0.0000 C   0  0  0  0  0  0           0  0  0
    8.5767  -11.3245    0.0000 C   0  0  0  0  0  0           0  0  0
    8.5703  -10.3578    0.0000 C   0  0  0  0  0  0           0  0  0
    2.3766  -11.9653    0.0000 C   0  0  0  0  0  0           0  0  0
    5.4048   -8.0161    0.0000 C   0  0  0  0  0  0           0  0  0
    2.3695  -10.8862    0.0000 C   0  0  0  0  0  0           0  0  0
    2.4719  -13.7855    0.0000 C   0  0  0  0  0  0           0  0  0
    7.1803   -8.7253    0.0000 C   0  0  0  0  0  0           0  0  0
    7.7977   -7.5504    0.0000 C   0  0  0  0  0  0           0  0  0
    6.1550   -8.6695    0.0000 O   0  0  0  0  0  0           0  0  0
    7.0097   -6.2138    0.0000 C   0  0  0  0  0  0           0  0  0
    5.7261   -9.3390    0.0000 C   0  0  0  0  0  0           0  0  0
    5.5820   -7.0857    0.0000 N   0  0  0  0  0  0           0  0  0
    1.6316  -13.3661    0.0000 C   0  0  0  0  0  0           0  0  0
    7.7492  -12.8549    0.0000 O   0  0  0  0  0  0           0  0  0
    1.6254  -12.4119    0.0000 C   0  0  0  0  0  0           0  0  0
    8.5362   -7.0663    0.0000 C   0  0  0  0  0  0           0  0  0
    7.9612   -8.9785    0.0000 O   0  0  0  0  0  0           0  0  0
    7.7858   -5.7420    0.0000 C   0  0  0  0  0  0           0  0  0
    8.5302   -6.1580    0.0000 C   0  0  0  0  0  0           0  0  0
    9.4749   -9.7810    0.0000 O   0  0  0  0  0  0           0  0  0
    8.6620  -13.5281    0.0000 C   0  0  0  0  0  0           0  0  0
    8.9890   -8.7634    0.0000 C   0  0  0  0  0  0           0  0  0
    4.5516   -8.1550    0.0000 O   0  0  0  0  0  0           0  0  0
    8.6687  -14.5489    0.0000 O   0  0  0  0  0  0           0  0  0
    4.0768  -13.8833    0.0000 O   0  0  0  0  0  0           0  0  0
    3.1995  -11.6390    0.0000 C   0  0  0  0  0  0           0  0  0
    2.4776  -14.6439    0.0000 O   0  0  0  0  0  0           0  0  0
    5.5763   -6.2233    0.0000 C   0  0  0  0  0  0           0  0  0
    9.4200   -5.8813    0.0000 O   0  0  0  0  0  0           0  0  0
    9.6512  -11.8924    0.0000 C   0  0  0  0  0  0           0  0  0
    9.0177   -7.4215    0.0000 O   0  0  0  0  0  0           0  0  0
    6.2861   -5.8061    0.0000 C   0  0  0  0  0  0           0  0  0
    0.9270  -13.9374    0.0000 C   0  0  0  0  0  0           0  0  0
    9.5507  -13.0639    0.0000 C   0  0  0  0  0  0           0  0  0
    1.6559  -15.1576    0.0000 C   0  0  0  0  0  0           0  0  0
    9.7288   -7.2168    0.0000 C   0  0  0  0  0  0           0  0  0
    6.0443  -12.7495    0.0000 S   0  0  0  0  0  0           0  0  0
    5.2569  -12.1255    0.0000 H   0  0  0  0  0  0           0  0  0
    4.4042   -9.0125    0.0000 O   0  0  0  0  0  0           0  0  0
  2  1  1  0     0  0
  3  4  2  0     0  0
  4 12  1  0     0  0
 11  5  1  6     0  0
  6  2  1  0     0  0
  7 20  1  6     0  0
  8  6  1  0     0  0
  9  1  1  0     0  0
 10  2  1  0     0  0
 11  9  1  0     0  0
 12  1  1  0     0  0
 13  3  1  0     0  0
 14  4  1  0     0  0
 15  8  2  0     0  0
 16  7  1  0     0  0
 17 18  1  0     0  0
 18 14  2  0     0  0
 19 21  1  0     0  0
 20 25  1  0     0  0
 21 11  1  0     0  0
 22 15  1  0     0  0
 24 16  2  0     0  0
 25 27  1  0     0  0
 26 16  1  0     0  0
 12 27  1  1     0  0
 28  7  1  0     0  0
 29 31  1  0     0  0
 30 13  1  0     0  0
 31 19  2  0     0  0
 32 24  1  0     0  0
 33 14  1  0     0  0
 34 26  2  0     0  0
 35 34  1  0     0  0
 36 18  1  0     0  0
 37 30  1  0     0  0
 38 33  1  0     0  0
 39 20  2  0     0  0
 40 37  2  0     0  0
 41 15  1  0     0  0
 42  5  1  0     0  0
 43 22  1  0     0  0
 44 28  1  0     0  0
 45 35  1  0     0  0
 46 17  1  0     0  0
 47 32  1  0     0  0
 48 44  1  0     0  0
 49 29  1  0     0  0
 50 37  1  0     0  0
 51 43  1  0     0  0
 52 47  1  0     0  0
 10 53  1  1     0  0
  2 54  1  6     0  0
 10  3  1  0     0  0
  6  5  1  6     0  0
 19  8  1  0     0  0
  7 23  1  1     0  0
 38 36  1  0     0  0
 17 13  2  0     0  0
 29 22  2  0     0  0
 48 26  1  0     0  0
 35 32  2  0     0  0
 53 23  1  0     0  0
  9 55  1  6     0  0
M  END

Which gets:
java -jar lychi-all-v0.1.jar test.mol

DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUS96TNY5ZD

java -jar lychi-all-v0.1.jar test.mol | java -jar lychi-all-v0.1.jar

DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ3C1UCNTD
@tylerperyea
Contributor Author

The simple, poor man's resolution to this, of course, is to feed the output of the standardizer back into the standardizer until the hash stops changing. I'm not yet aware of infinitely oscillating hashes, but if they exist, such a procedure could bail out and notify the user of an error...
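
A minimal sketch of that re-feeding loop, assuming a hypothetical `standardize` callable that takes a structure and returns a `(standardized_structure, hash)` pair. That wrapper is not an existing API of the tool; it stands in for however the jar is actually invoked. The loop stops when two consecutive passes give the same hash, and bails out if an earlier hash reappears (an oscillation) or if `max_passes` is exhausted.

```python
from typing import Callable, List, Tuple


def standardize_until_stable(
    structure: str,
    standardize: Callable[[str], Tuple[str, str]],  # hypothetical wrapper
    max_passes: int = 10,
) -> Tuple[str, str, int]:
    """Re-feed standardizer output until the hash repeats on consecutive passes.

    Returns (structure, hash, passes_run). Raises RuntimeError if the hash
    oscillates or does not settle within max_passes.
    """
    seen: List[str] = []
    current = structure
    for passes in range(1, max_passes + 1):
        current, h = standardize(current)
        if seen and h == seen[-1]:
            # Same hash twice in a row: treat it as the fixed point.
            return current, h, passes
        if h in seen:
            # An older hash came back: the hashes are cycling.
            raise RuntimeError(f"hash oscillates: {seen + [h]}")
        seen.append(h)
    raise RuntimeError(f"no stable hash after {max_passes} passes: {seen}")


# Toy stand-in for the standardizer (made-up structures and hashes),
# mimicking the behaviour reported above: the hash changes once, then settles.
toy = {"raw": ("s1", "A"), "s1": ("s2", "B"), "s2": ("s2", "B")}
print(standardize_until_stable("raw", toy.__getitem__))  # ('s2', 'B', 3)
```

Note that confirming stability costs one extra pass: the hash settles on the second pass here, but it takes a third to see it repeat.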

@caodac
Contributor

caodac commented Jan 4, 2014

This is fixed as of commit 9b38dbd. I've also reworked how we handle stereocenters with explicit Hs.

tylerperyea added a commit that referenced this issue May 1, 2019
fix with some tests for issue #4, #3, #7, needs evaluation