This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

Allow top/skip on GET https://stitcher.ncats.io/api/stitches/latest when using filters #145

Open
snowzach opened this issue Nov 20, 2020 · 4 comments

Comments

@snowzach

The API call appears to support the top and skip query options but they don't actually work.

@snowzach snowzach changed the title Add pagination to GET https://stitcher.ncats.io/api/stitches/latest Fix top/skip on GET https://stitcher.ncats.io/api/stitches/latest Nov 20, 2020
@snowzach snowzach changed the title Fix top/skip on GET https://stitcher.ncats.io/api/stitches/latest Allow top/skip on GET https://stitcher.ncats.io/api/stitches/latest when using filters Nov 20, 2020
@southalln
Contributor

southalln commented Nov 20, 2020

Can you provide an example URL that demonstrates the kind of filtering that you are doing?
Simple stuff like http://stitcher.ncats.io/api/stitches/latest?skip=10&top=5 seems to be working

@snowzach
Author

Sure... I was trying with /api/stitches/latest?filter=Status/Launched and I attempted a pull request #147 with some changes that might work. I haven't gotten it to compile yet. I am Java stupid.

@southalln
Contributor

OK. What is going on here is a little complicated. Let me try and break that down as best I can. TLDR = we run out of memory on the filter step, but also I am not sure this filter is giving you what you really want.

First, the "status" property comes from a couple of sources - "Launched" as a value specifically comes from Broad. You can see all the places "status" comes from by looking at https://stitcher.ncats.io/api/datasources and specifically looking for which sources provide a "status" property.

NOTE: this "status" is different from the status shown at drugs.ncats.io. Inxight: Drugs calculates its regulatory status from the "highestPhase" property at the top level of the stitch object. The value of the "highestPhase" property corresponds to the ID of a regulatory event from the "events" array. That referenced event contains the highest development status achieved and citation information for that event. That means, however, that it is not possible to simply filter the stitches in the way that you might want for all of the marketed drugs. This should probably be the subject of another issue/feature request.
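To make the lookup described above concrete, here is a minimal sketch of resolving "highestPhase" to its event on an already-downloaded stitch record. It assumes each entry in "events" carries its identifier under an "id" key; that key name, the function name, and the sample record are illustrative assumptions, not something confirmed in this thread.

```python
from typing import Any, Dict, Optional


def highest_phase_event(stitch: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the event referenced by the stitch's top-level "highestPhase".

    Assumption: each entry in "events" stores its identifier under "id".
    Returns None when the property is missing or no event matches.
    """
    target = stitch.get("highestPhase")
    if target is None:
        return None
    for event in stitch.get("events", []):
        if event.get("id") == target:
            return event
    return None


# Made-up record, for illustration only; real stitches have more fields.
example = {
    "highestPhase": "evt-2",
    "events": [
        {"id": "evt-1", "kind": "placeholder-a"},
        {"id": "evt-2", "kind": "placeholder-b"},
    ],
}
print(highest_phase_event(example))
```

Because the status lives in a referenced event rather than in a flat property, this kind of selection has to happen client-side after fetching the records, which is why a plain filter= query cannot express it.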

If we look at other properties, such as "WIKIPEDIA" from GSRS, top and skip work just fine:
https://stitcher.ncats.io/api/stitches/latest?filter=WIKIPEDIA
https://stitcher.ncats.io/api/stitches/latest?filter=WIKIPEDIA&top=3
https://stitcher.ncats.io/api/stitches/latest?filter=WIKIPEDIA&top=3&skip=2

In fact, https://stitcher.ncats.io/api/stitches/latest?filter=WIKIPEDIA/ALOSETRON&top=3&skip=2 properly filters down to a single record, and the now meaningless top and skip are reset to 1 and 0, respectively, in the response.
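For anyone scripting against these URLs, a small helper along the following lines can assemble them. It is only a sketch: the function name `stitches_url` is made up here, and the only query parameters it knows about are the three that appear in this thread (filter, top, skip).

```python
from urllib.parse import urlencode

BASE_URL = "https://stitcher.ncats.io/api/stitches/latest"


def stitches_url(filter_expr=None, top=None, skip=None):
    """Build a /api/stitches/latest URL from optional filter, top and skip."""
    params = {}
    if filter_expr is not None:
        params["filter"] = filter_expr
    if top is not None:
        params["top"] = top
    if skip is not None:
        params["skip"] = skip
    if not params:
        return BASE_URL
    # safe="/" leaves the slash in filters like "WIKIPEDIA/ALOSETRON" unescaped.
    return BASE_URL + "?" + urlencode(params, safe="/")


print(stitches_url("WIKIPEDIA", top=3, skip=2))
print(stitches_url("WIKIPEDIA/ALOSETRON", top=3, skip=2))
```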

Now, looking specifically at /api/stitches/latest?filter=status/Launched
something funny is going on. On my development instance, this produces an out-of-memory error. In prod, the server might be crashing and restarting without producing any useful kind of response for you. I believe there must be a couple of really large records in the approved products list that exhaust the memory. In fact, when I page over all of the stitches, I increment in steps of 10. I can page through all of the JSON for all stitches in about 10 minutes on the prod server.
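The paging in steps of 10 mentioned above could look roughly like the sketch below. The `fetch_page(skip, top)` callable is a hypothetical stand-in: a real client would issue the GET with those skip and top values and pull the list of records out of the response, whose exact shape is not shown in this thread. Stopping on a short or empty page is also an assumption about how the end of the data shows up.

```python
from typing import Any, Callable, Iterator, List


def iter_stitches(
    fetch_page: Callable[[int, int], List[Any]], step: int = 10
) -> Iterator[Any]:
    """Yield records page by page, advancing skip by `step` each request."""
    skip = 0
    while True:
        page = fetch_page(skip, step)
        if not page:
            break  # an empty page means nothing is left
        yield from page
        if len(page) < step:
            break  # a short page is taken to be the last one
        skip += step


# Stand-in for the server: 25 fake records served from a local list.
fake_records = list(range(25))


def fake_fetch(skip: int, top: int) -> List[int]:
    return fake_records[skip:skip + top]


print(len(list(iter_stitches(fake_fetch))))  # prints 25
```

Keeping the step small bounds how much the server has to serialize per request, which matters if a few records are very large.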

Leaving the current issue as a bug report ... we need to protect the server from out of memory issues, perhaps by truncating some of the stitch records, somehow.

@snowzach
Author

Could you guys have the server set up to auto-restart if it crashes?


2 participants