GO-CAM API / blazegraph query unable to handle large imported models #5

Closed
kltm opened this issue May 23, 2022 · 4 comments

kltm commented May 23, 2022

It seems that the GO-CAM API query (as explained in geneontology/api-gorest#3 (comment)) is not able to complete in the allotted 60s time on the current machine. Moreover, whatever it is doing with resources prevents other queries from running and often brings down the service, sometimes the blazegraph instance itself.

We are currently mitigating this with a decreased 30s timeout (down from 60s), which seems to prevent the MGI queries from overheating blazegraph (and at least the interface shows a "0", even if that is only because of the error). As well, we still have the hourly "production" blazegraph restart in place, just in case.
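
(For reference, this is roughly how a per-request cap can be passed to Blazegraph directly; a minimal sketch only, with a placeholder endpoint URL, and assuming the deployed instance honors the standard `timeout` parameter / `X-BIGDATA-MAX-QUERY-MILLIS` header, which should be double-checked against our build.)

```python
import requests

# Placeholder endpoint; substitute the real GO-CAM Blazegraph SPARQL URL.
SPARQL_ENDPOINT = "http://localhost:9999/blazegraph/sparql"

query = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

resp = requests.post(
    SPARQL_ENDPOINT,
    data={"query": query, "timeout": "30"},  # Blazegraph timeout parameter, in seconds
    headers={
        "Accept": "application/sparql-results+json",
        # Millisecond cap recognized by Blazegraph's NanoSparqlServer
        # (assumption: supported/enabled on this instance).
        "X-BIGDATA-MAX-QUERY-MILLIS": "30000",
    },
    timeout=35,  # client-side safety net, slightly above the server-side cap
)
resp.raise_for_status()
print(resp.json()["results"]["bindings"])
```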

Moving forward, we'll need to find a way to either

  1. Speed up the query, or prevent it from running on overly large models (i.e. only run it against models with fewer than X nodes); a rough sketch of the latter follows this list
  2. Implement "new metadata tags in Noctua / GO-CAM universe to support distinguishing standard and causal models" (noctua#746) and get that into the GO-CAM API
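
For option 1, a minimal sketch of what a size gate could look like (the cutoff value, the endpoint URL, and the assumption that each model is a named graph of OWL individuals are all mine, not the API's actual logic):

```python
import requests

SPARQL_ENDPOINT = "http://localhost:9999/blazegraph/sparql"  # placeholder URL
MAX_NODES = 500  # hypothetical "X nodes" cutoff

# Assumes each GO-CAM model is stored as a named graph whose nodes are typed
# owl:NamedIndividual; verify against the actual triplestore layout.
SIZE_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?model (COUNT(?ind) AS ?n)
WHERE { GRAPH ?model { ?ind a owl:NamedIndividual } }
GROUP BY ?model
"""

resp = requests.post(
    SPARQL_ENDPOINT,
    data={"query": SIZE_QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()

small_enough = [
    row["model"]["value"]
    for row in resp.json()["results"]["bindings"]
    if int(row["n"]["value"]) < MAX_NODES
]
# The expensive causal query would then only be run against these models.
print(f"{len(small_enough)} models under the {MAX_NODES}-node cutoff")
```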

Tagging @balhoff @dustine32 @sierra-moxon @vanaukenk @tmushayahama

kltm added the bug label May 23, 2022

kltm commented May 24, 2022

Due to continued fluttering (about a dozen times since end-of-work yesterday), I have increased the restart frequency, dropping the interval from 60m to 30m.

kltm referenced this issue May 24, 2022
Improved causal-by-GP query by Jim for geneontology/api-gorest#5
kltm transferred this issue from geneontology/api-gorest May 24, 2022

kltm commented May 24, 2022

With a little testing, we can see that we now have a new problem: many identifiers are causing 413 "payload too large" errors. For example:

wb/WBGene00004488
rgd/1564080
mgi/MGI:3781580
hgnc/16171
sgd/S000005537
flybase/FBgn0025334

@dustine32 reverting to older code for now, until we can work out with @balhoff what is going wrong.
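
For anyone spot-checking, this is roughly the kind of loop being used against the identifiers above (the base URL/path here is a placeholder, not the GO-CAM API's actual route):

```python
import requests

API_BASE = "https://example.org/api/gp"  # placeholder; use the actual GO-CAM API route

identifiers = [
    "wb/WBGene00004488",
    "rgd/1564080",
    "mgi/MGI:3781580",
    "hgnc/16171",
    "sgd/S000005537",
    "flybase/FBgn0025334",
]

for gp in identifiers:
    r = requests.get(f"{API_BASE}/{gp}/models", timeout=60)
    label = "413 payload too large" if r.status_code == 413 else str(r.status_code)
    print(f"{gp}: {label}")
```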

kltm added a commit that referenced this issue May 25, 2022
Remove possibly excessive whitespace in an attempt to help with #5

kltm commented May 25, 2022

Okay, I've done a bunch more testing here and would like to make the following notes:

  • while certain identifiers return the 413 error a lot, it is not consistent
    • once an identifier starts getting a 413, it tends to keep getting one
    • sometimes it will start working (i.e. 200 and correct results); when this happens it tends to keep working (until it changes state again)
  • I'm no longer convinced that the issue is with apache2 alone, but possibly in its interaction with blazegraph; looking at debugging messages for the proxy, apache2 is apparently successfully connecting to blazegraph, with the 413 occurring later on; if the incoming request to apache2 were the only problem, I would expect the 413 to occur before the proxy makes the connection

That said, debugging has not given much in the way of real understanding of what is going on. As a "fingers crossed" approach, and assuming that the issue is being caused by the literal query length, I've made an attempt to compress the query a little with #7. Locally, I've now been unable to produce a 413 for some time, which was not previously the case. I could just be lucky right now, but I think this may be a "fix" for our current issue.
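
For context, the change in #7 essentially just shrinks the literal query text before it goes over the wire; something along these lines (a sketch of the idea, not the actual code in #7, and only safe if no string literals in the query depend on internal whitespace):

```python
import re

def compress_sparql(query: str) -> str:
    """Collapse runs of whitespace so the encoded query takes up fewer bytes."""
    return re.sub(r"\s+", " ", query).strip()

original = """
SELECT ?model ?cause ?effect
WHERE {
    GRAPH ?model {
        ?cause ?relation ?effect .
    }
}
"""
compressed = compress_sparql(original)
print(len(original), "->", len(compressed), "characters before URL-encoding")
```

If the 413 is a size limit somewhere along the apache2/blazegraph path (apache2 has several, e.g. LimitRequestLine and LimitRequestBody), trimming the query text is the least invasive way to get back under it.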


kltm commented May 25, 2022

Okay, we've had about 20m, 200 queries, and no errors--which is a pretty huge success compared to where we've been operating since last Friday.
I think, unless something else comes up, we can close this one up.

Thank you for your help on this @balhoff and @dustine32 !
