
CDS changing infrastructure: changes to users and possibly some cmorizers #3750

Open
bettina-gier opened this issue Sep 16, 2024 · 4 comments

Comments

@bettina-gier
Contributor

The Copernicus Climate Data Store (CDS) is changing its infrastructure: the current "legacy" version will be decommissioned on September 26th, at which point the current beta will take over.
On the user side, an ECMWF account will now be required instead of the previous separate CDS and ADS accounts, and the .cdsapirc file needs to be updated. You will also have to accept the licences on the website again while logged in.

For more info check their website.

There are also a few datasets with cmorizer scripts that this might affect, either because the link to the data changes or, as happened to me in #3749, because the keys of the API request change. Using the beta access, I also had to download the files as .zip and change some common functions to accept that format, because the current default of turning the downloaded files into .tar files gave me errors.
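To illustrate the kind of change to the common functions mentioned above, here is a minimal sketch that unpacks a download whether it arrives as a zip or a tar archive. The function name `extract_archive` is hypothetical and is not taken from the existing downloader utilities; it only uses the Python standard library.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_archive(archive_path, output_dir):
    """Extract a zip or tar archive and return the extracted file paths.

    Hypothetical helper for illustration; not part of the existing code.
    """
    archive_path = Path(archive_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Inspect the file content instead of trusting the file extension.
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(output_dir)
    elif tarfile.is_tarfile(archive_path):
        # "r:*" lets tarfile detect gzip/bz2/xz compression on its own.
        with tarfile.open(archive_path, mode="r:*") as archive:
            archive.extractall(output_dir)
    else:
        raise ValueError(f"Unsupported archive format: {archive_path}")

    # Return files only, sorted for a reproducible order.
    return sorted(p for p in output_dir.rglob("*") if p.is_file())
```

Checking the content rather than the suffix means the same call keeps working if the store switches the delivered format again.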

I'll recheck my new cmorizer (#3749) after the change, and can probably check whether others are affected and need changes. So this is mostly a heads-up to check your user-side setup.

@valeriupredoi
Contributor

thanks, Tina! I expect they will fold those API changes in the cds package too, so we'd expect API breakages when they release that new package

@bettina-gier
Contributor Author

I just had a short look through the CDS webpage to see what the API request it generates looks like compared to our current downloaders.

  • cds_xch4, cds_uerra: all options present and accounted for, should be fine
  • cds_satellite_soil_moisture: currently set to version v201912, for which an issue was reported of only NaNs between 21/11/2021 and 20/08/2023. We currently only download up to 2020, but it might be good to switch to the newer version (v202212) here.
  • cds_satellite_fapar, cds_satellite_albedo, cds_satellite_soil_moisture: our script currently sets format: 'tgz', which no longer exists in the new version. We will need to check whether this stops the download (and what format the data will be in now).

Biggest gripe atm: you're supposed to use the CDS API for both CDS and ADS content, but they have different endpoints. Questions in their forum about how to use both URLs at once in the .cdsapirc aren't being answered. I tried a simple list, but that just gave me a syntax error. So currently I would need to switch manually from one to the other depending on the download, unless I'm stupid and there's a simple way. The forum also seems to have a whole lot of complaints about the new API at the moment. Probably wise to observe for a bit longer before making any changes.

@valeriupredoi
Contributor

most useful @bettina-gier cheers 🍺

@bettina-gier
Contributor Author

Update on the different endpoints in the CDS API:
A forum post suggests setting the URL through an environment variable that is changed in the Python script. Following this approach would require changes to our downloader scripts for anything obtained through CDS. I would suggest adding the URLs to the CDS downloader script itself and giving it a key "service" (name debatable) that takes the values CDS/ADS/EWDS and internally assigns the URL. That way, if another change happens, the URLs get one central fix instead of a fix in each individual downloader script.
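A rough sketch of what such a central mapping could look like is given below. The names `SERVICE_URLS`, `resolve_service_url` and `make_client` are hypothetical, and the endpoint URLs are assumptions that should be checked against each data store's documentation before use. It relies on `cdsapi.Client` accepting `url` and `key` keyword arguments, and assumes that a key left as `None` is still read from the environment or the .cdsapirc file.

```python
# Assumed endpoint URLs; verify against each data store's documentation.
SERVICE_URLS = {
    "CDS": "https://cds.climate.copernicus.eu/api",
    "ADS": "https://ads.atmosphere.copernicus.eu/api",
    "EWDS": "https://ewds.climate.copernicus.eu/api",
}


def resolve_service_url(service="CDS"):
    """Return the API endpoint for a service name (case-insensitive)."""
    name = service.upper()
    if name not in SERVICE_URLS:
        valid = ", ".join(sorted(SERVICE_URLS))
        raise ValueError(f"Unknown service '{service}', expected one of: {valid}")
    return SERVICE_URLS[name]


def make_client(service="CDS", key=None):
    """Create a cdsapi client pointed at the endpoint of the given service."""
    # Imported lazily so the URL mapping can be used without cdsapi installed.
    import cdsapi

    # Assumption: with key=None, cdsapi falls back to its usual key lookup.
    return cdsapi.Client(url=resolve_service_url(service), key=key)
```

An individual downloader would then only pass something like `make_client("ADS")`, so an endpoint change would touch the dictionary alone.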

Still unsure if we should try implementing this now or wait to see if there are any further developments.
