Accessing all submissions with specified fields times out. #44

jonquet · 2018-04-16T17:13:51Z

In implementing the OMTD-share wrapper at:
ontoportal-lirmm/ncboproxy#3

We use the following REST call to access all the latest submissions and compile a zip file of their converted metadata description in XML.

data.bioontology.org/submissions?display=description,URI,identifier,version,homepage,documentation,contact,hasLicense,copyrightHolder,download,hasFormalityLevel,naturalLanguage,hasOntologyLanguage,reference,hasCreator,numberOfClasses,numberOfIndividuals,hasDomain,ontologiesRelatedTo,keywords,reference,publication,wasGeneratedBy,isBackwardsCompatibleWith,similarTo,hasPart,acronym,name,viewingRestriction,download,ontology,email

But the call times out most of the time. It occasionally succeeds when repeated right after a failure, probably because the result is cached by then.

graybeal · 2018-04-16T23:01:45Z

A quick-and-dirty answer that I haven't had a chance to make sure is accurate:

It's a really heavyweight request across 800 files. I wonder whether it would be possible to request a much smaller subset at a time?

Also, using display_links=false is a more performant version of the REST call if you’re not interested in seeing the hypermedia links, which I think you're not.
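One reading of "a much smaller subset at a time" is to ask for fewer fields per request. The sketch below only shows how a long field list could be split into batches. The helper name `field_batches`, the batch size, and the choice of `acronym` as the field repeated in every batch are all assumptions for illustration, and whether batching actually reduces load on the server has not been verified here.

```ruby
# Split a long field list into batches of at most `size` names,
# repeating `key` in every batch so the results can be matched up afterwards.
# `size` must be at least 2 (the key plus one other field).
def field_batches(fields, size: 8, key: "acronym")
  (fields.uniq - [key]).each_slice(size - 1).map { |batch| [key, *batch] }
end

fields = %w[acronym name description version homepage contact hasLicense]
field_batches(fields, size: 4).each { |batch| puts batch.join(",") }
```

Each batch would then be sent as its own `display=` value, and the responses merged on the repeated field.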

jvendetti · 2018-04-16T23:04:58Z

How often are you making this call? Calling the /submissions endpoint is expensive - last I checked we had over 14K submission objects in the triplestore. We don't utilize this endpoint anywhere from our UI.

jonquet · 2018-04-17T00:06:07Z

Indeed, the call should be optimized with &display_links=false&display_context=false. Maybe @twktheainur has already added those parameters.
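For reference, a minimal sketch of building such a URL with both parameters set. The helper name `submissions_url` and the `https://` scheme are assumptions, and authentication (for example an API key) is left out. Duplicate field names, such as the repeated `reference` and `download` in the original request, are dropped before the query is built.

```ruby
require "uri"

# Build a /submissions URL that asks for the given fields and switches off
# the hypermedia links and the context in the response.
def submissions_url(fields, base: "https://data.bioontology.org")
  query = URI.encode_www_form(
    display: fields.uniq.join(","),
    display_links: false,
    display_context: false
  )
  "#{base}/submissions?#{query}"
end

puts submissions_url(%w[acronym name description version])
```

`URI.encode_www_form` percent-encodes the commas in the field list, which is equivalent to the literal commas once the server decodes the query string.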

Concerning the frequency, we will not make this call often, only from time to time.
However, note that the /submissions call only returns the latest submissions, not all of them.
At least that is how it behaves on our appliances ;)

Which call do you guys use when populating the Browse page?

graybeal · 2018-04-20T20:05:34Z

Talking to Clement, he's pretty sure this call returns the most recent submissions only, at least as they are using it.

twktheainur · 2018-04-23T12:26:22Z

@graybeal @jonquet @jvendetti To summarise.

The call always retrieves the latest submissions, not only as we are using it. The code in bioportal_web_ui, in the ontologies_controller, confirms that this call is used to populate the Browse page:

    # The attributes used when retrieving the submission. We are not retrieving all attributes to be faster
    browse_attributes = "ontology,acronym,submissionStatus,description,pullLocation,creationDate,released,name,naturalLanguage,hasOntologyLanguage,hasFormalityLevel,isOfType,contact"
    submissions = LinkedData::Client::Models::OntologySubmission.all(include_views: true, display_links: false, display_context: false, include: browse_attributes)

There is no particular reason why the call should create significantly more load than when someone visits the browse page.

Concerning the timeout, after investigation it turns out that the problem isn't with BioPortal but with our reverse proxy configuration, which had a low timeout threshold. The /submissions query itself executes quite fast (much less than 30 seconds). However, we do additional processing and send a few HEAD requests to check which ontologies are restricted for download (separated by a wait period of 1s to avoid any excessive load on the system), so the whole operation takes significantly longer and results in a reverse-proxy timeout on our end.
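A rough sketch of that kind of throttled check, to show where the time goes. This is not the wrapper's actual code: the helper names are hypothetical, and treating any non-2xx answer (or a network error) as "restricted" is an assumption.

```ruby
require "net/http"
require "uri"

# Send a HEAD request and report whether the server answered with a 2xx status.
# Timeouts are explicit so that one slow host cannot stall the whole run.
def downloadable?(url, timeout: 10)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https",
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.head(uri.request_uri).is_a?(Net::HTTPSuccess)
  end
rescue StandardError
  false # assumption: treat errors the same as a refused download
end

# Check each download URL in turn, pausing `delay` seconds between requests.
# `check` and `pause` are injectable so the loop can be exercised offline.
def restricted_downloads(urls, delay: 1, check: method(:downloadable?), pause: method(:sleep))
  restricted = []
  urls.each_with_index do |url, i|
    pause.call(delay) if i.positive?
    restricted << url unless check.call(url)
  end
  restricted
end
```

With n URLs the pauses alone add (n - 1) × `delay` seconds on top of the request time, and that total has to fit under the reverse proxy's timeout if the work is done inside a single proxied request.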

I believe that this issue may be closed.

graybeal · 2018-04-23T18:56:48Z

I agree. Thanks for the details and the extra care not to tire out our little system! ;-)

jonquet mentioned this issue Apr 16, 2018

Extension of the OMTD-AgroPortal to NCBO BioPortal and BiblioPortal ontoportal-lirmm/ncboproxy#3

Open
