Fix bug: update Zenodo downloader for new API #373

Closed
wants to merge 4 commits into from
28 changes: 23 additions & 5 deletions pooch/downloaders.py
@@ -807,13 +807,26 @@ def download_url(self, file_name):
         -------
         download_url : str
             The HTTP URL that can be used to download the file.
+
+        Notes
+        -----
+        After Zenodo migrated to InvenioRDM in October 2023, their API
+        changed. The file links in the API response lead to 404 errors
+        (as of 2023-10-17). The files are available at the following URL:
+        ``https://zenodo.org/records/{article_id}/files/{file_name}?download=1``.
         """
-        files = {item["key"]: item for item in self.api_response["files"]}
-        if file_name not in files:
+        # Check if the file exists in the repository
+        filenames = [item["filename"] for item in self.api_response["files"]]
+        if file_name not in filenames:
             raise ValueError(
-                f"File '{file_name}' not found in data archive {self.archive_url} (doi:{self.doi})."
+                f"File '{file_name}' not found in data archive "
+                f"{self.archive_url} (doi:{self.doi})."
             )
-        download_url = files[file_name]["links"]["self"]
+        # Build the download URL
+        article_id = self.api_response["id"]
+        download_url = (
+            f"https://zenodo.org/records/{article_id}/files/{file_name}?download=1"
+        )
         return download_url

     def populate_registry(self, pooch):
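
The URL construction described in the Notes of `download_url` can be tried outside the class with a minimal sketch like the one below. The function name `build_zenodo_download_url` and the contents of `sample_response` (record ID, file name, checksum) are made up for illustration; only the `"id"`, `"files"`, and `"filename"` keys and the URL pattern come from the diff above.

```python
def build_zenodo_download_url(api_response, file_name):
    """Return the download URL for ``file_name`` in a Zenodo record.

    Hypothetical standalone helper that mirrors the logic in the diff.
    """
    # Check that the file is listed in the record
    filenames = [item["filename"] for item in api_response["files"]]
    if file_name not in filenames:
        raise ValueError(
            f"File '{file_name}' not found in record {api_response['id']}."
        )
    # Build the URL from the record ID instead of the links in the response
    article_id = api_response["id"]
    return f"https://zenodo.org/records/{article_id}/files/{file_name}?download=1"


# Made-up response containing only the keys the helper reads
sample_response = {
    "id": 1234567,
    "files": [
        {"filename": "data.csv", "checksum": "d41d8cd98f00b204e9800998ecf8427e"},
    ],
}

url = build_zenodo_download_url(sample_response, "data.csv")
print(url)
```

Like the diff, this sketch inserts `file_name` into the URL unchanged, so file names with spaces or other reserved characters would probably need percent-encoding first.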
@@ -824,10 +837,15 @@ def populate_registry(self, pooch):
         ----------
         pooch : Pooch
             The pooch instance that the registry will be added to.
+
+        Notes
+        -----
+        After Zenodo migrated to InvenioRDM in October 2023, their API
+        changed. Each file checksum in the API response is now an MD5 sum.
         """

         for filedata in self.api_response["files"]:
-            pooch.registry[filedata["key"]] = filedata["checksum"]
+            pooch.registry[filedata["filename"]] = f"md5:{filedata['checksum']}"


 class FigshareRepository(DataRepository):  # pylint: disable=missing-class-docstring
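
To illustrate why the registry entries gain an `md5:` prefix, here is a rough sketch of how an `algorithm:hexdigest` string can be split and checked with the standard library. The helpers `build_registry` and `hash_matches` and the sample data are hypothetical and are not Pooch's actual implementation; the checksum is the MD5 digest of empty content.

```python
import hashlib


def build_registry(api_response):
    """Map each file name to an 'md5:<hexdigest>' string, as in the diff."""
    return {
        item["filename"]: f"md5:{item['checksum']}"
        for item in api_response["files"]
    }


def hash_matches(data, known_hash):
    """Check ``data`` (bytes) against an 'algorithm:hexdigest' string."""
    # The prefix names the hashing algorithm to use
    algorithm, _, expected = known_hash.partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected.lower()


# Made-up response: the checksum is the MD5 digest of an empty file
sample_response = {
    "files": [
        {"filename": "empty.txt", "checksum": "d41d8cd98f00b204e9800998ecf8427e"},
    ],
}

registry = build_registry(sample_response)
is_valid = hash_matches(b"", registry["empty.txt"])
print(registry["empty.txt"], is_valid)
```

Without the prefix, a consumer of the registry would have to guess which algorithm produced the digest.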