Add support for native ERA5 data in GRIB format #2178

Open
wants to merge 82 commits into
base: main

Contributor

@schlunma schlunma commented Aug 23, 2023

Description

This PR allows ESMValCore to process native ERA5 data in GRIB format, which is for example available on Levante in the /pool/data/ERA5 directory.

Reading the data

The following settings are necessary in the user configuration file:

rootpath:
  ...
  native6:
    /pool/data/ERA5: DKRZ-ERA5-GRIB
  ...

I added an extra facets file which includes reasonable defaults for all supported variables. You can check it out here.

Thus, reading this data is as easy as

datasets:
  - {project: native6, dataset: ERA5, timerange: '2000/2001', short_name: tas, mip: Amon}
  - {project: native6, dataset: ERA5, timerange: '2000/2001', short_name: cl, mip: Amon, tres: 1H, frequency: 1hr}
  - {project: native6, dataset: ERA5, timerange: '2000/2001', short_name: ta, mip: Amon, type: fc, typeid: '12'}
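
The second and third entries above set facets (`tres`, `frequency`, `type`, `typeid`) on top of the defaults from the extra facets file. One way to picture this is a dictionary merge in which the recipe entry wins over the defaults. The sketch below only illustrates that idea: `merge_facets` is a hypothetical helper, not an ESMValCore function, and the default values are placeholders rather than the real contents of the extra facets file.

```python
def merge_facets(defaults, recipe_entry):
    """Return a copy of the defaults, overridden by the recipe entry."""
    merged = dict(defaults)
    merged.update(recipe_entry)
    return merged


# Placeholder defaults: NOT the real values from the extra facets file.
hypothetical_defaults = {
    "tres": "<default tres>",
    "type": "<default type>",
    "typeid": "<default typeid>",
}

# Third dataset entry from the recipe snippet above.
recipe_entry = {
    "project": "native6",
    "dataset": "ERA5",
    "timerange": "2000/2001",
    "short_name": "ta",
    "mip": "Amon",
    "type": "fc",
    "typeid": "12",
}

facets = merge_facets(hypothetical_defaults, recipe_entry)

# Facets set in the recipe replace the defaults; the rest are kept.
print(facets["type"], facets["typeid"], facets["tres"])
```

Under this assumption, only the facets that differ from the defaults need to be spelled out in the recipe.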

Regridding

Native ERA5 data in GRIB format is on a reduced Gaussian grid (i.e., an unstructured grid). Thus, in almost all use cases it is necessary to regrid this data, especially since no cell areas are available for it (so we cannot even calculate global or regional statistics on the native grid). The CMORizer does this regridding automatically (as recommended by the ECMWF), but it can be disabled in the recipe:

datasets:
  - {project: native6, dataset: ERA5, timerange: '2000/2001', short_name: tas, mip: Amon, automatic_regrid: false}
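
To illustrate why a reduced Gaussian grid has to be handled as an unstructured grid: the number of longitude points differs between latitude rows, so the points do not fit into a rectangular latitude × longitude array and are stored as a flat list of (lat, lon) pairs instead. The toy sketch below uses invented latitudes and row sizes that do not correspond to the actual ERA5 grid.

```python
# Toy "reduced" grid: fewer longitude points towards the poles.
# Latitudes and points per row are invented for illustration only.
points_per_row = {60.0: 4, 20.0: 8, -20.0: 8, -60.0: 4}

lats = []
lons = []
for lat, n_lon in points_per_row.items():
    step = 360.0 / n_lon  # equal longitude spacing within one row
    for i in range(n_lon):
        lats.append(lat)
        lons.append(i * step)

# One flat index over all points instead of separate lat and lon axes.
n_points = len(lats)

# A rectangular grid would need the same number of points in every row.
is_rectangular = len(set(points_per_row.values())) == 1

print(n_points, is_rectangular)
```

Since the rows have different lengths, there is no common longitude axis, which is why the data needs regridding before most lat/lon-based operations.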

This PR depends on the following other PRs:


Closes #1991
Closes ESMValGroup/ESMValTool#3238

Link to documentation: https://esmvaltool--2178.org.readthedocs.build/projects/ESMValCore/en/2178/quickstart/find_data.html#supported-native-reanalysis-observational-datasets


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@schlunma schlunma added this to the v2.10.0 milestone Aug 23, 2023
@schlunma schlunma self-assigned this Aug 23, 2023

codecov bot commented Aug 23, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.95%. Comparing base (4526c84) to head (57369c9).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2178      +/-   ##
==========================================
+ Coverage   94.77%   94.95%   +0.18%     
==========================================
  Files         251      251              
  Lines       14266    14350      +84     
==========================================
+ Hits        13520    13626     +106     
+ Misses        746      724      -22     


Contributor Author

schlunma commented Aug 25, 2023

This is ready from my side, but there are two issues that need to be resolved before I mark this ready for review:

I tested this thoroughly with the following recipe: recipe_000.yml.txt

An example run is available on Levante here: /home/b/b309141/scratch/esmvaltool_output/recipe_000_20230825_080240

Note that with the default dask scheduler, this recipe ran into a timeout after 8 hours with 67/76 tasks finished. With the following dask configuration, I could run the same recipe on the same node (regular Levante compute node with 256 GiB of memory) in 5:27 min (!!) 🚀.

cluster:
  type: distributed.LocalCluster
  n_workers: 32
  threads_per_worker: 4
  memory_limit: 8 GiB
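
As a rough sanity check of these numbers, assuming `memory_limit` is applied per worker (as it is for `distributed.LocalCluster`), the configuration adds up to the full memory of the node. The speedup computed below is only a lower bound, since the run with the default scheduler hit the timeout before finishing.

```python
# Values taken from the dask configuration above.
n_workers = 32
threads_per_worker = 4
memory_limit_gib = 8  # assumed to be a per-worker limit

total_threads = n_workers * threads_per_worker
total_memory_gib = n_workers * memory_limit_gib

# Run times quoted above: timeout after 8 hours vs. 5:27 min.
timeout_seconds = 8 * 3600
fast_run_seconds = 5 * 60 + 27

# Lower bound only: the 8 hour run finished 67 of 76 tasks.
min_speedup = timeout_seconds / fast_run_seconds

print(total_threads, total_memory_gib, round(min_speedup))
```

The total of 256 GiB matches the memory of the compute node mentioned above, so these settings leave no headroom beyond what the workers are allowed to use.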

@ESMValGroup/technical-lead-development-team

@schlunma schlunma modified the milestones: v2.10.0, v2.11.0 Sep 28, 2023
5 participants