Accessing all submissions with specified fields times out. #44
Comments
A quick-and-dirty answer that I haven't had a chance to verify: it's a really heavyweight request across 800 files. Would it be possible to request a much smaller subset at a time? Also, display_links=false makes the REST call more performant if you're not interested in seeing the hypermedia links, which I think you're not.
How often are you making this call? Calling the /submissions endpoint is expensive - last I checked we had over 14K submission objects in the triplestore. We don't utilize this endpoint anywhere from our UI.
Indeed, the call should be optimized with &display_links=false&display_context=false. Maybe @twktheainur has already added those parameters. As for frequency, we will not make this call often, only from time to time. Which call do you guys use when populating the
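For illustration, a minimal Ruby sketch of adding the two flags mentioned above to an existing request URL. The helper name `with_lightweight_flags` and the `https://` scheme are assumptions made for this example; only the parameter names `display_links` and `display_context` come from the thread.

```ruby
require "uri"

# Hypothetical helper (not part of any BioPortal client): adds the two
# flags discussed in this thread to an API URL, keeping any query
# parameters that are already present.
def with_lightweight_flags(url)
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || "").to_h
  params["display_links"] = "false"
  params["display_context"] = "false"
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# The https scheme and the short attribute list are assumptions for the example.
puts with_lightweight_flags("https://data.bioontology.org/submissions?display=acronym,name")
```

Note that `URI.encode_www_form` percent-encodes the commas in the `display` value; the decoded value is unchanged.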
I talked to Clement, and he's pretty sure this call returns only the most recent submissions, at least as they are using it.
@graybeal @jonquet @jvendetti To summarise: the call always retrieves the latest submissions, not only as we are using it. The code in bioportal_web_ui, in the ontologies_controller, confirms that this call is used to populate the Browse page:

# The attributes used when retrieving the submission. We are not retrieving all attributes to be faster
browse_attributes = "ontology,acronym,submissionStatus,description,pullLocation,creationDate,released,name,naturalLanguage,hasOntologyLanguage,hasFormalityLevel,isOfType,contact"
submissions = LinkedData::Client::Models::OntologySubmission.all(include_views: true, display_links: false, display_context: false, include: browse_attributes)

There is no particular reason why the call should create significantly more load than when someone visits the browse page.

Concerning the timeout, after investigation it turns out that the problem is not with BioPortal but with our reverse proxy configuration, which had a low timeout threshold. The /submissions query actually executes quite fast (much less than 30 seconds), but since we do additional processing and a few HEAD requests to check which ontologies are restricted for download (separated by a wait period of 1s in order to avoid any excessive load on the system), the whole operation takes significantly longer and results in a reverse-proxy timeout on our end.

I believe this issue can be closed.
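To make the timing concrete, here is a hedged Ruby sketch of the kind of throttled HEAD check described above. It is not the actual ncboproxy code: the function name `restricted_downloads`, the injectable `head` and `pause` callables, and the assumption that a 401 or 403 status marks a restricted download are all illustrative. With n URLs and a 1s delay, the loop alone adds at least (n - 1) seconds on top of the /submissions query, which is how a fast query can still run into a short proxy timeout.

```ruby
require "net/http"
require "uri"

# Sketch only: checks URLs one at a time and pauses between requests.
# `head` and `pause` can be swapped out so the loop runs without network access.
def restricted_downloads(urls, delay: 1.0, head: nil, pause: ->(s) { sleep(s) })
  head ||= lambda do |url|
    uri = URI.parse(url)
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.head(uri.request_uri).code.to_i
    end
  end

  restricted = []
  urls.each_with_index do |url, i|
    pause.call(delay) if i.positive? # wait between requests, not before the first
    status = head.call(url)
    # Assumption: a 401 or 403 response means the download is restricted.
    restricted << url if [401, 403].include?(status)
  end
  restricted
end

# Usage with stubbed responses (the example.org URLs are placeholders).
fake_status = { "https://example.org/a" => 200, "https://example.org/b" => 403 }
p restricted_downloads(fake_status.keys, head: ->(u) { fake_status[u] }, pause: ->(_s) {})
```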
I agree. Thanks for the details and the extra care not to tire out our little system! ;-)
In implementing the OMTD-share wrapper at:
ontoportal-lirmm/ncboproxy#3
We use the following REST call to access all the latest submissions and compile a zip file of their converted metadata description in XML.
data.bioontology.org/submissions?display=description,URI,identifier,version,homepage,documentation,contact,hasLicense,copyrightHolder,download,hasFormalityLevel,naturalLanguage,hasOntologyLanguage,reference,hasCreator,numberOfClasses,numberOfIndividuals,hasDomain,ontologiesRelatedTo,keywords,reference,publication,wasGeneratedBy,isBackwardsCompatibleWith,similarTo,hasPart,acronym,name,viewingRestriction,download,ontology,email
But the call times out most of the time. The exception is when the call is repeated right after a failure, in which case the result is probably cached.
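As a side observation, the `display` list in the call above names `reference` and `download` twice. The short Ruby sketch below finds such repeats and builds a de-duplicated list; the constant and variable names are made up for this example, and nothing here suggests the repeats are related to the timeout.

```ruby
# Attribute list copied from the REST call quoted above.
DISPLAY = "description,URI,identifier,version,homepage,documentation,contact," \
          "hasLicense,copyrightHolder,download,hasFormalityLevel,naturalLanguage," \
          "hasOntologyLanguage,reference,hasCreator,numberOfClasses," \
          "numberOfIndividuals,hasDomain,ontologiesRelatedTo,keywords,reference," \
          "publication,wasGeneratedBy,isBackwardsCompatibleWith,similarTo,hasPart," \
          "acronym,name,viewingRestriction,download,ontology,email"

attributes = DISPLAY.split(",")

# Names that occur more than once, each reported once.
duplicates = attributes.select { |a| attributes.count(a) > 1 }.uniq

# Same list with repeats dropped, keeping first-occurrence order.
unique_display = attributes.uniq.join(",")

puts "duplicates: #{duplicates.join(', ')}"
puts "display=#{unique_display}"
```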