Bulk export of summary data for granules #152

Open
Dorado1987A opened this issue Mar 13, 2024 · 2 comments

@Dorado1987A

Hi,

I previously commented on a ticket last year under my other account - RogueSergeant - about the search function of the API.

It's working great.

However, I now need to "hydrate" a lot of the information that's returned, in bulk. Currently, the only way I can see to do this is to loop through every result and make a GET call to each result's summary link. This is clearly inefficient and puts a lot of strain on your systems, which I don't want to cause!

An example:

https://api.govinfo.gov/packages/CHRG-112shrg68086/granules/CHRG-112shrg68086/summary
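For reference, the per-result loop described above might look roughly like the sketch below. It is only an illustration: the `packageId` and `granuleId` keys on each search result, passing the key as an `api_key` query parameter, and the helper names `summary_url`, `fetch_json` and `hydrate` are assumptions, not something confirmed in this thread.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List

API_ROOT = "https://api.govinfo.gov"


def summary_url(package_id: str, granule_id: str, api_key: str) -> str:
    """Build a granule summary URL in the same shape as the example above."""
    path = "/packages/{}/granules/{}/summary".format(
        urllib.parse.quote(package_id, safe=""),
        urllib.parse.quote(granule_id, safe=""),
    )
    # Assumption: the API key is passed as an `api_key` query parameter.
    return API_ROOT + path + "?" + urllib.parse.urlencode({"api_key": api_key})


def fetch_json(url: str) -> dict:
    """Plain GET that decodes a JSON response body (standard library only)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def hydrate(
    results: Iterable[dict],
    api_key: str,
    fetch: Callable[[str], dict] = fetch_json,
    delay: float = 0.0,
) -> List[dict]:
    """Attach a `summary` entry to each search result, one GET per result."""
    hydrated = []
    for result in results:
        # Assumption: each result carries `packageId` and `granuleId` keys.
        url = summary_url(result["packageId"], result["granuleId"], api_key)
        merged = dict(result)  # copy so the caller's dicts are not mutated
        merged["summary"] = fetch(url)
        hydrated.append(merged)
        if delay:
            time.sleep(delay)  # optional pause between requests
    return hydrated
```

The `fetch` argument is injectable so the loop can be exercised without touching the network, and `delay` gives a simple way to space requests out.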

Is there any way to get a paginated response from the API by providing a list of these URLs, or of package/granule IDs?

Thanks,

Dorado

@jonquandt jonquandt self-assigned this Mar 14, 2024
@jonquandt
Member

I see that you are interested in Congressional hearings. What kind of information are you trying to get from the granule summaries?

From the search service response, you should be able to directly download the MODS file, which includes metadata about the result in XML format.

If you set the resultLevel to "package", you'll be able to get the package-level MODS that includes links and metadata associated with each granule in a single file, which may be more efficient for your purposes.
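As a rough sketch of that approach, the snippet below builds a search request body with `resultLevel` set to "package" and collects a MODS link per result. Only `resultLevel` comes from this thread; the other body fields (`query`, `pageSize`, `offsetMark`), the `results`, `packageId` and `download.modsLink` response keys, and the helper names are assumptions that should be checked against the API documentation.

```python
import json
from typing import Dict


def build_search_body(query: str, page_size: int = 100, offset_mark: str = "*") -> str:
    """Serialize a search request body that asks for package-level results."""
    body = {
        # Assumed field names, except `resultLevel`, which is named in the thread.
        "query": query,
        "pageSize": page_size,
        "offsetMark": offset_mark,
        "resultLevel": "package",
    }
    return json.dumps(body)


def mods_links(search_response: dict) -> Dict[str, str]:
    """Map each package ID to its MODS download link, skipping results without one."""
    links = {}
    for result in search_response.get("results", []):
        # Assumption: the MODS URL sits under download -> modsLink.
        link = result.get("download", {}).get("modsLink")
        if link:
            links[result["packageId"]] = link
    return links
```

With one MODS download per package rather than one summary call per granule, the number of requests should drop roughly in proportion to the number of granules per package.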

When you ask about returning a paginated list given a set of package or granule ids, do you mean you would like a paginated response that essentially contains an array with all of the summary information together?

That currently isn't possible, and producing it would likely be more taxing on our system than generating the individual summaries. From our perspective, we're fairly well equipped to deal with a large number of requests, so pulling individual summaries isn't an issue on our end at this time.

If we built functionality that let you specify which fields to return in the search response, would that be helpful?

@Dorado1987A
Author

Hi, yes, a function to add requested fields to the search result would be fantastic. I'm essentially trying to add abstracts and additional depth to the results returned, and to do so as efficiently as possible!
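In the meantime, if the MODS files mentioned earlier in the thread do carry abstracts, they could be pulled out with something like the sketch below. It assumes the standard MODS v3 namespace and its `<abstract>` element; whether a given govinfo MODS file actually includes an abstract is not confirmed here, and `abstracts_from_mods` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace URI defined by the MODS v3 schema.
MODS_ABSTRACT = "{http://www.loc.gov/mods/v3}abstract"


def abstracts_from_mods(xml_text: str) -> List[str]:
    """Return every non-empty <abstract> in a MODS document, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    # iter() walks the whole tree, so nested items are included too.
    for element in root.iter(MODS_ABSTRACT):
        if element.text and element.text.strip():
            # Collapse internal whitespace and line breaks to single spaces.
            found.append(" ".join(element.text.split()))
    return found
```

An empty list simply means no abstract was present in that file, in which case the summary endpoint would still be needed for those records.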
