Feature Request: implementation of a pooch downloader for the PDR #350

Open
jat255 opened this issue Nov 7, 2024 · 2 comments

Comments

@jat255
Member

jat255 commented Nov 7, 2024

Hi @RayPlante, I'm not sure this is the best place to raise this, but I was hoping the team might consider a small effort to work with the pooch project to support programmatic downloading of data from data.nist.gov. If you're not familiar with it, pooch has become a widely used tool in the scientific Python community for downloading datasets and other web resources, with built-in caching and some other nifty tricks.

The coolest part (to me) is the DOIDownloader class, which lets you say pooch.retrieve("doi:10.6084/m9.figshare.14763051.v1/tiny-data.txt"); it parses the DOI and downloads the underlying data in one step. Currently, this class supports figshare, Zenodo, and Dataverse instances. I think adding support for the NIST PDR could do a lot for interoperability.
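For readers who haven't used it, here is a minimal sketch of that DOI-based download. The DOI and file name are the figshare example quoted above; the helper names `doi_url` and `fetch_by_doi` are hypothetical, made up for illustration. Passing `known_hash=None` skips checksum verification, which is fine for a demo but not something you'd want in real use.

```python
def doi_url(doi: str, filename: str) -> str:
    """Build the "doi:<DOI>/<file name>" string that pooch's DOI support parses."""
    return f"doi:{doi.strip('/')}/{filename}"


def fetch_by_doi(doi: str, filename: str, known_hash=None) -> str:
    """Download one file from a DOI-backed repository and return its local path.

    known_hash=None disables checksum verification (demo only).
    """
    import pooch  # third-party; imported lazily so doi_url() has no dependencies

    # pooch resolves the DOI, picks the matching repository backend,
    # downloads the file into its cache, and returns the cached path.
    return pooch.retrieve(url=doi_url(doi, filename), known_hash=known_hash)


if __name__ == "__main__":
    # Requires network access and `pip install pooch`.
    print(fetch_by_doi("10.6084/m9.figshare.14763051.v1", "tiny-data.txt"))
```

A second call with the same arguments should reuse the cached copy instead of downloading again.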

One use case internally: our package ETSpy includes a few datasets for testing and demonstration that are currently distributed with the package (not ideal, as it bloats the package size). The common way of dealing with this is to host the files in a repository somewhere and then use pooch to fetch them on demand and cache them for later use. Zenodo is the most common choice for this, but since ETSpy is a NIST project, it would be preferred (required?) to host those files in the PDR. Being able to easily use pooch for that with a DOI would be great.
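To make that fetch-on-demand pattern concrete, here is a rough sketch of how a package could wire it up with `pooch.create`. Everything specific here is an assumption: the DOI is just the figshare example from above standing in for a real dataset record, the cache name `etspy-example` and the helper names are hypothetical, and a real registry would list actual checksums rather than `None`.

```python
# Stand-in DOI (the figshare example above); a real package would use the
# DOI of its own dataset record.
EXAMPLE_DOI = "10.6084/m9.figshare.14763051.v1"

# Maps file name -> checksum. None skips verification (demo only).
REGISTRY = {"tiny-data.txt": None}


def base_url_for(doi: str) -> str:
    """Return a pooch base URL of the form "doi:<DOI>/"."""
    return f"doi:{doi.strip('/')}/"


def make_fetcher(cache_name: str = "etspy-example"):
    """Create a pooch manager that downloads registry files into a local cache."""
    import pooch  # third-party; imported lazily

    return pooch.create(
        path=pooch.os_cache(cache_name),  # per-user cache directory
        base_url=base_url_for(EXAMPLE_DOI),
        registry=REGISTRY,
    )


def data_path(filename: str) -> str:
    """Return the local path of a dataset file, downloading it on first use."""
    return make_fetcher().fetch(filename)


if __name__ == "__main__":
    # Requires network access and `pip install pooch`.
    print(data_path("tiny-data.txt"))
```

With something like this, the test and demo data no longer needs to ship inside the package; it is downloaded the first time it is requested and read from the cache afterwards.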

Assuming the pooch team is open to it, I may have some cycles to work on this interoperability bit if it's of interest to the team.

@jat255
Member Author

jat255 commented Nov 7, 2024

I just went ahead and did it today: fatiando/pooch#442

@GRG2
Collaborator

GRG2 commented Nov 8, 2024

@jat255 this looks interesting, very cool.
