Steps to reproduce:
Set query molecule to didD@@QInUxV`@@B and run using the idorsia_toy_space_a.txt synthon space.
Expected behavior:
For Synthon_A of snar_b-25 there should be hits for four different synthons: dcLDpEtKhhbSiIf^v[hHBf@@, dcLDpEtKichYAIeY~kh@bf@@, dmtDPITKickHhdhcJz@Hf@@ and dcNDPAWPnfNdfUgzn`BJX@@ (560 in total).
Actual result:
Results for only one synthon are returned (440 in total).
Probable cause:
The break statements at lines 2763 and 2767 of SynthonSpace.java. As a result, mapped_frag is always of size 1. This seems to be done on purpose, but it results in rather unexpected behavior.
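To illustrate the suspected mechanism, here is a minimal sketch of how stopping at the first match limits a result list to one entry. It is not the code from SynthonSpace.java: the class `FirstMatchSketch`, the method `collectMatches`, and the `stopAtFirst` switch are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration, not taken from SynthonSpace.java.
final class FirstMatchSketch {

    // Collects the synthons that match the query. With stopAtFirst set,
    // the loop exits after the first match, so the returned list can
    // never hold more than one element.
    static <T> List<T> collectMatches(List<T> synthons,
                                      Predicate<T> matchesQuery,
                                      boolean stopAtFirst) {
        List<T> mapped = new ArrayList<>();
        for (T synthon : synthons) {
            if (matchesQuery.test(synthon)) {
                mapped.add(synthon);
                if (stopAtFirst) {
                    break; // the early exit that caps the list at size 1
                }
            }
        }
        return mapped;
    }
}
```

With four matching synthons as input, the early-exit variant returns one entry, while the variant without the break returns all four.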
Thanks for testing the hyperspace software thoroughly!
The observed behavior is indeed "a feature" and intentional. One of the challenges in implementing the algorithm was handling "very general" queries in a reasonable way. The reasoning behind this "break" is that if we find a complete substructure hit inside a single building block, we assume that the query is very general and will probably generate an excessive number of hits. In the toy space, 500 / 37k is >1% of the complete space; in large spaces this might be millions or billions of structures, probably "more than we can easily handle in subsequent processing of the structures". The measures for handling "excessive results" are also described in the JCIM publication, in the subsection "Handling excessive enumeration of results" (it was not in the preprint, but one of the reviewers rightly requested it).
I agree that this is somewhat confusing, and that it might cut off interesting structures. The first implementation of the software used a "process all structures, then report the full result at once" approach, so such rather strict cutoff criteria were really necessary. I have since extended the software so that it can also "continuously stream" results, which means removing these cutoff mechanisms from the algorithm could be an option. I don't know if this would really be helpful, though. Alternatively, it could make sense to include a "results might be truncated" flag in the results whenever one of the cutoff mechanisms is engaged (there are also two other hard limits in the algorithm).
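As a rough sketch of the "results might be truncated" idea, a result object could carry a flag that is set whenever a cutoff ends the search early. Everything here is an assumption made for illustration: `TruncationFlagSketch`, `SearchResult`, `search`, and the `maxHits` limit are hypothetical and do not reflect the actual hyperspace API or its real cutoff criteria.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a truncation flag, not the hyperspace API.
final class TruncationFlagSketch {

    // Pairs the reported hits with a flag telling the caller
    // whether a cutoff stopped the search before all candidates were seen.
    static final class SearchResult {
        final List<String> hits;
        final boolean truncated;

        SearchResult(List<String> hits, boolean truncated) {
            this.hits = hits;
            this.truncated = truncated;
        }
    }

    // Accepts candidates until maxHits (an assumed hard limit) is reached.
    // If further candidates remain at that point, the result is flagged.
    static SearchResult search(List<String> candidates, int maxHits) {
        List<String> hits = new ArrayList<>();
        boolean truncated = false;
        for (String candidate : candidates) {
            if (hits.size() >= maxHits) {
                truncated = true; // a cutoff was engaged; report it
                break;
            }
            hits.add(candidate);
        }
        return new SearchResult(hits, truncated);
    }
}
```

A caller could then check `result.truncated` and warn the user that further synthons may match, instead of silently returning a partial hit list.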