Speed up mesh child terms query #123

Merged
merged 3 commits into from
Nov 1, 2022

@kkaris kkaris commented Oct 31, 2022

This PR resolves #122. The slow part turned out to be querying for mesh child terms over more than one hop while using only isa as the relationship type. One query for one of the tests runs for 340 seconds in this case.

For some (still unclear) reason, querying using -[:isa|partof*1..]-> is about 170x faster 🤯 than using just -[:isa*1..]->, regardless of whether the result is empty, when using the query structure below:

MATCH (c:BioEntity)-[:<rel expr>]->(:BioEntity {id: "mesh:<mesh id>"})
RETURN c.id

Below is a table detailing the rough timing in seconds for the above query using different values for mesh id and rel expr:

| `<rel expr>` | D015002 (has no children) | D007855 (1 direct child and 2 indirect children) |
| --- | --- | --- |
| `isa` | 0.4 | 0.4 |
| `isa*1..` | 342 | 342 |
| `isa/partof` | 1.6 | 1.7 |
| `isa/partof*1..` | 2.0 | 2.0 |
| `partof/isa` | 1.6 | 1.7 |
| `partof/isa*1..` | 2.0 | 2.0 |
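
For anyone who wants to reproduce timings like these, here is a minimal sketch of how the query could be built and timed for different values of mesh id and rel expr. The names `build_query`, `time_query`, and `run_query` are hypothetical and not part of this PR; `run_query` stands in for whatever callable sends a Cypher string to the database and returns the rows.

```python
import time
from typing import Callable, Iterable

# Query structure from the PR description, with placeholders for the
# relationship expression and the mesh id.
QUERY_TEMPLATE = (
    'MATCH (c:BioEntity)-[:{rel_expr}]->(:BioEntity {{id: "mesh:{mesh_id}"}})\n'
    "RETURN c.id"
)


def build_query(rel_expr: str, mesh_id: str) -> str:
    """Fill the template with a relationship expression and a mesh id."""
    return QUERY_TEMPLATE.format(rel_expr=rel_expr, mesh_id=mesh_id)


def time_query(run_query: Callable[[str], Iterable], query: str) -> float:
    """Return the wall-clock seconds taken to run a query and read all rows.

    `run_query` is a hypothetical stand-in for the real database call.
    """
    start = time.perf_counter()
    list(run_query(query))  # consume the rows so lazy results are fetched
    return time.perf_counter() - start
```

With the numbers in the table, 342 s for `isa*1..` against 2.0 s for the two-type variable-length pattern gives a ratio of 171, which matches the "about 170x" figure.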

The changes in this PR make use of this by changing the relationship pattern in the query from :isa*1.. to :partof|isa*1.., with some post-filtering that takes a negligible amount of time.
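
As a rough illustration of the idea (not necessarily the exact code in this PR), the sketch below assumes the broader query also returns the relationship types along each path, and that the post-filtering keeps only children reached purely through isa edges. The names `CHILD_TERMS_QUERY` and `keep_isa_only`, and the shape of the returned rows, are assumptions made for this example.

```python
from typing import Iterable, List, Sequence, Tuple

# Assumed query shape: match over both relationship types (the fast variant)
# and return the relationship types on each path for filtering in Python.
CHILD_TERMS_QUERY = (
    "MATCH p=(c:BioEntity)-[:partof|isa*1..]->(:BioEntity {id: $mesh_id})\n"
    "RETURN c.id AS id, [r IN relationships(p) | type(r)] AS rel_types"
)


def keep_isa_only(rows: Iterable[Tuple[str, Sequence[str]]]) -> List[str]:
    """Keep ids whose path consists only of isa relationships.

    Each row is assumed to be (child id, relationship types on the path).
    Duplicate ids are dropped while preserving first-seen order.
    """
    seen = set()
    children = []
    for child_id, rel_types in rows:
        if child_id in seen:
            continue
        if all(rel_type == "isa" for rel_type in rel_types):
            seen.add(child_id)
            children.append(child_id)
    return children
```

The filter is a single pass over the returned rows, so its cost is small next to the query itself as long as the result set is modest.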

@bgyori bgyori merged commit 7a0775e into main Nov 1, 2022
@bgyori bgyori deleted the mesh-child-terms branch February 28, 2023 21:22
Successfully merging this pull request may close these issues.

Slow tests
2 participants