Recipe test results for ESMValCore v2.11.0rc1 #2421

chrisbillowsMO · 2024-05-16T11:43:48Z

Recipe test results for v2.11.0rc1

This is the initial output from testing done for releasing ESMValCore v2.11.0rc1. Please see the following comment for our evaluation of the failures.

Recipe running session 2024-05-15

Setup

`mamba` version

levante5> mamba --version
mamba 1.5.8
conda 24.5.0

ESMValTool version

levante5> esmvaltool version
ESMValCore: 2.11.0rc1
ESMValTool: 2.11.0.dev75+g4734caf5a.d20240515

Recipes that ran successfully (132 out of 160)

recipe_albedolandcover.yml
recipe_anav13jclim.yml
recipe_arctic_ocean.yml
recipe_autoassess_landsurface_permafrost.yml
recipe_autoassess_landsurface_soilmoisture.yml
recipe_autoassess_landsurface_surfrad.yml
recipe_autoassess_stratosphere.yml
recipe_bock20jgr_fig_1-4.yml
recipe_bock20jgr_fig_6-7.yml
recipe_capacity_factor.yml
recipe_climate_change_hotspot.yml
recipe_climwip_brunner2019_med.yml
recipe_climwip_brunner20esd.yml
recipe_climwip_test_basic.yml
recipe_climwip_test_performance_sigma.yml
recipe_clouds_bias.yml
recipe_clouds_ipcc.yml
recipe_cmug_h2o.yml
recipe_concatenate_exps.yml
recipe_consecdrydays.yml
recipe_correlation.yml
recipe_cox18nature.yml
recipe_cvdp.yml
recipe_daily_era5.yml
recipe_deangelis15nat.yml
recipe_deangelis15nat_fig1_fast.yml
recipe_decadal.yml
recipe_diurnal_temperature_index.yml
recipe_eady_growth_rate.yml
recipe_ecs.yml
recipe_ecs_constraints.yml
recipe_ecs_scatter.yml
recipe_ensclus.yml
recipe_era5-land.yml
recipe_esacci_lst.yml
recipe_esacci_oc.yml
recipe_extract_shape.yml
recipe_extreme_index.yml
recipe_eyring06jgr.yml
recipe_flato13ipcc_figure_914.yml
recipe_flato13ipcc_figure_924.yml
recipe_flato13ipcc_figure_942.yml
recipe_flato13ipcc_figure_945a.yml
recipe_flato13ipcc_figure_96.yml
recipe_flato13ipcc_figure_98.yml
recipe_flato13ipcc_figures_926_927.yml
recipe_flato13ipcc_figures_92_95.yml
recipe_flato13ipcc_figures_938_941_cmip3.yml
recipe_flato13ipcc_figures_938_941_cmip6.yml
recipe_galytska23jgr.yml
recipe_gier2020bg.yml
recipe_globwat.yml
recipe_heatwaves_coldwaves.yml
recipe_hydro_forcing.yml
recipe_hype.yml
recipe_iht_toa.yml
recipe_impact.yml
recipe_ipccwg1ar6ch3_fig_3_42_b.yml
recipe_ipccwg1ar6ch3_fig_3_43.yml
recipe_ipccwg1ar6ch3_fig_3_9.yml
recipe_kcs.yml
recipe_landcover.yml
recipe_lauer13jclim.yml
recipe_lauer22jclim_fig1_clim.yml
recipe_lauer22jclim_fig1_clim_amip.yml
recipe_lauer22jclim_fig2_taylor.yml
recipe_lauer22jclim_fig2_taylor_amip.yml
recipe_lauer22jclim_fig6_interannual.yml
recipe_lauer22jclim_fig7_seas.yml
recipe_lauer22jclim_fig8_dyn.yml
recipe_lauer22jclim_fig9-11c_pdf.yml
recipe_li17natcc.yml
recipe_lisflood.yml
recipe_marrmot.yml
recipe_meehl20sciadv.yml
recipe_model_evaluation_basics.yml
recipe_model_evaluation_clouds_clim.yml
recipe_model_evaluation_clouds_cycles.yml
recipe_model_evaluation_precip_zonal.yml
recipe_modes_of_variability.yml
recipe_monitor.yml
recipe_monitor_with_refs.yml
recipe_mpqb_xch4.yml
recipe_multimodel_products.yml
recipe_my_personal_diagnostic.yml
recipe_ncl.yml
recipe_ocean_Landschuetzer2016.yml
recipe_ocean_amoc.yml
recipe_ocean_bgc.yml
recipe_ocean_example.yml
recipe_ocean_ice_extent.yml
recipe_ocean_multimap.yml
recipe_ocean_scalar_fields.yml
recipe_perfmetrics_CMIP5.yml
recipe_perfmetrics_CMIP5_4cds.yml
recipe_perfmetrics_land_CMIP5.yml
recipe_preprocessor_test.yml
recipe_psyplot.yml
recipe_pv_capacity_factor.yml
recipe_python.yml
recipe_python_for_CI.yml
recipe_quantilebias.yml
recipe_r.yml
recipe_radiation_budget.yml
recipe_rainfarm.yml
recipe_runoff_et.yml
recipe_russell18jgr.yml
recipe_schlund20jgr_gpp_abs_rcp85.yml
recipe_schlund20jgr_gpp_change_1pct.yml
recipe_schlund20jgr_gpp_change_rcp85.yml
recipe_sea_surface_salinity.yml
recipe_seaborn.yml
recipe_seaice.yml
recipe_seaice_drift.yml
recipe_seaice_feedback.yml
recipe_shapeselect.yml
recipe_smpi.yml
recipe_smpi_4cds.yml
recipe_snowalbedo.yml
recipe_spei.yml
recipe_tcr.yml
recipe_thermodyn_diagtool.yml
recipe_toymodel.yml
recipe_validation.yml
recipe_validation_CMIP6.yml
recipe_variable_groups.yml
recipe_weigel21gmd_figures_13_16.yml
recipe_wenzel14jgr.yml
recipe_wenzel16nat.yml
recipe_wflow.yml
recipe_williams09climdyn_CREM.yml
recipe_zmnam.yml

Recipes that failed because the diagnostic script failed (11 out of 160)

recipe_combined_indices.yml
recipe_extreme_events.yml
recipe_hyint.yml
recipe_hyint_extreme_events.yml
recipe_martin18grl.yml
recipe_miles_block.yml
recipe_miles_eof.yml
recipe_miles_regimes.yml
recipe_pcrglobwb.yml
recipe_schlund20esd.yml
recipe_wenzel16jclim.yml

Recipes that failed because of missing data (3 out of 160)

recipe_aod_aeronet_assess.yml
recipe_bock20jgr_fig_8-10.yml
recipe_check_obs.yml

Recipes that failed because the run took too long (6 out of 160)

recipe_carvalhais14nat.yml
recipe_eyring13jgr_12.yml
recipe_ipccwg1ar6ch3_fig_3_19.yml
recipe_ipccwg1ar6ch3_fig_3_42_a.yml
recipe_lauer22jclim_fig5_lifrac.yml
recipe_lauer22jclim_fig9-11ab_scatter.yml

Recipes that failed for other reasons or are still running (7 out of 160)

recipe_collins13ipcc.yml
recipe_easy_ipcc.yml
recipe_ipccwg1ar6ch3_atmosphere.yml
recipe_lauer22jclim_fig3-4_zonal.yml
recipe_ocean_quadmap.yml
recipe_preprocessor_derive_test.yml
recipe_tebaldi21esd.yml

Recipes that are known to be broken (1 out of 160)

recipe_julia.yml

chrisbillowsMO · 2024-05-16T12:20:46Z

Hi @ESMValGroup/technical-lead-development-team @bouweandela @valeriupredoi

Any comments on the following evaluation please? (The original output from running the recipes for the first time is above).

1. R diagnostic failures

The following are R recipes with various errors. Would anyone with R knowledge please take a look?

recipe_combined_indices.yml (see Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValTool#3674)
recipe_extreme_events.yml (see Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValTool#3674)
recipe_hyint.yml (see Update the name of the remapcon2 operator in R recipes ESMValTool#3610)
recipe_hyint_extreme_events.yml (see Update the name of the remapcon2 operator in R recipes ESMValTool#3610)
recipe_miles_block.yml (see Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValTool#3674)
recipe_miles_eof.yml (see Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValTool#3674)
recipe_miles_regimes.yml (see Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValTool#3674)

Each of these failures showed one of the following two errors:

Error in (models_dataset == reference_dataset) && (models_exp == reference_exp) :
  'length = 2' in coercion to 'logical(1)'

                     ^ Operator >remapcon2< not found!

2. Python diagnostic failures

We have the capacity to address these errors - should we? Or does anyone already know how to solve these?

recipe_martin18grl.yml (see Diagnostic failure for recipe_martin18grl.yml on v2.11.0rc1 #2424).

KeyError: 'Provenance record for /scratch/b/b382148/esmvaltool_output/recipe_martin18grl_20240515_142625/plots/spi_collect/spi_collect/SPI_time_series_Bremen_Observations.png already exists.'

recipe_pcrglobwb.yml (see Diagnostic failure for recipe_pcrglobwb.yml on v2.11.0rc1 #2425).

iris.exceptions.ConcatenateError: failed to concatenate into a single cube.
  Cube metadata differs for phenomenon: precipitation_flux

recipe_schlund20esd.yml (see Diagnostic failure for recipe_schlund20esd.yml on v2.11.0rc1 ESMValTool#3604).

TypeError: unhashable type: 'CubeAttrsDict'
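
For the `KeyError` reported for recipe_martin18grl.yml above, one possible cause (an assumption, not confirmed in this thread) is that the diagnostic logs provenance for the same plot file twice. A minimal sketch of that failure mode and a guard against it follows. `ProvenanceTable` and `log_once` are hypothetical stand-ins written for illustration, not the ESMValTool API:

```python
class ProvenanceTable:
    """Hypothetical stand-in for a provenance logger that rejects duplicates."""

    def __init__(self):
        self.table = {}

    def log(self, filename, record):
        """Store a record; raise KeyError if the file was already logged."""
        filename = str(filename)
        if filename in self.table:
            raise KeyError(f"Provenance record for {filename} already exists.")
        self.table[filename] = record


def log_once(table, filename, record):
    """Log a record only if this file has not been logged yet.

    Returns True if the record was written, False if it was skipped.
    """
    if str(filename) in table.table:
        return False
    table.log(filename, record)
    return True


table = ProvenanceTable()
plot = "plots/SPI_time_series_Bremen_Observations.png"  # shortened example path
first = log_once(table, plot, {"caption": "SPI time series"})
second = log_once(table, plot, {"caption": "SPI time series"})
print(first, second)
```

Whether skipping the second record or giving each plot a unique file name is the right fix depends on the diagnostic; see the linked issue for the actual resolution.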

3. NCL diagnostic failures

There is one NCL recipe with an error. Would anyone with NCL knowledge please take a look?

recipe_wenzel16jclim.yml (see Diagnostic failure for recipe_wenzel16jclim.yml on v2.11.0rc1 ESMValTool#3661)

INFO    fatal: in uajet_sh850, cannot read plev and latrange

4. Recipes that failed because of missing data

recipe_aod_aeronet_assess.yml (see Add CMORized AeroNET aerosol optical depth observations to the Tier 3 category ESMValTool#3067)
recipe_bock20jgr_fig_8-10.yml (see Missing data for recipe_bock20jgr_fig_8-10.yml ESMValTool#3659 and Fix recipe_bock20jgr_fig_8-10.yml ESMValTool#3665)
recipe_check_obs.yml (see Missing data on DKRZ for recipe_check_obs.yml ESMValTool#3660)

We recognise recipe_check_obs.yml is a known broken recipe but should we open a new issue to resolve the missing data issues with ESMValGroup/obs-maintainers?

5. Recipes that failed because the run took too long

recipe_eyring13jgr_12.yml
recipe_ipccwg1ar6ch3_fig_3_19.yml
recipe_ipccwg1ar6ch3_fig_3_42_a.yml
recipe_lauer22jclim_fig5_lifrac.yml

We've increased the time on all of these except for recipe_ipccwg1ar6ch3_fig_3_42_a.yml which was already at the maximum time. Is there anything we can do about this?

recipe_carvalhais14nat.yml
recipe_lauer22jclim_fig9-11ab_scatter.yml

We also had to increase the time limit for these two, which were originally listed in the section for recipes that failed for other reasons or are still running.

6. Recipes that failed because model data couldn't be downloaded

recipe_easy_ipcc.yml
recipe_ocean_quadmap.yml

7. Recipes that failed because of an HDF5 error

recipe_collins13ipcc.yml
recipe_ipccwg1ar6ch3_atmosphere.yml
recipe_tebaldi21esd.yml

These three are the same failures as in the v2.10 recipe test results.

recipe_preprocessor_derive_test.yml

This is a new entry.

8. Recipes that fail because of - we think! - an ESMValCore issue

recipe_lauer22jclim_fig3-4_zonal.yml (see Diagnostic failure for recipe_lauer22jclim_fig3-4_zonal.yml on v2.11.0rc1 #2427)

ValueError: Chunks and shape must be of the same length/dimension. Got chunks=(), shape=(1,)

valeriupredoi · 2024-05-16T14:22:21Z

great summary and work @chrisbillowsMO and @ehogan 🍺

Here is the issue with those three HDF5-related failures, as posted by @bouweandela back in December last year, when they were working on the 2.10 release: ESMValGroup/ESMValTool#3463 (comment)

This is an HDF5 thread-safety issue; it is flaky, but it appears to be mostly reproducible (positive flakiness, or was it negative? don't matter). This has to be fixed, most probably by adding a file lock() statement somewhere; I'll have a look myself, but don't set it as a roadblock towards the release IMO.

bouweandela · 2024-05-16T14:42:22Z

This Julia recipe has the following error:

recipe_rainfarm.yml

ERROR: LoadError: ArgumentError: Package YAML [ddb6d928-2868-570f-bddf-ab3f9cf99eb6] is required but does not seem to be installed:

Did you install the Julia dependencies?

valeriupredoi · 2024-05-16T15:16:23Z

fairly sure no is the answer to that q, bud 😁

ehogan · 2024-05-17T06:58:31Z

> This Julia recipe has the following error:
>
> recipe_rainfarm.yml
>
> ERROR: LoadError: ArgumentError: Package YAML [ddb6d928-2868-570f-bddf-ab3f9cf99eb6] is required but does not seem to be installed:
>
> Did you install the Julia dependencies?

No, I had missed the esmvaltool install Julia step. Both Julia recipes now succeed, so I will update the first and second comments to reflect this 👍

schlunma · 2024-05-17T08:20:35Z

> 10. Recipes that never ran
>
> * recipe_schlund20jgr_gpp_abs_rcp85.yml
> * recipe_schlund20jgr_gpp_change_1pct.yml
> * recipe_schlund20jgr_gpp_change_rcp85.yml
>
> These have been excluded from the generate.py script. @schlunma might you need to run these?

Successfully tested them 👍 I'll update the comment above to reflect this.

ehogan · 2024-05-17T09:58:03Z

> 5. Recipes that failed because the run took too long
>
> recipe_climate_change_hotspot.yml
> recipe_eyring06jgr.yml
> recipe_eyring13jgr_12.yml
> recipe_ipccwg1ar6ch3_fig_3_19.yml
> recipe_ipccwg1ar6ch3_fig_3_42_a.yml
> recipe_ipccwg1ar6ch3_fig_3_42_b.yml
> recipe_lauer22jclim_fig5_lifrac.yml
>
> We've increased the time on all of these except for recipe_ipccwg1ar6ch3_fig_3_42_a.yml which was already at the maximum time. Is there anything we can do about this?
>
> recipe_carvalhais14nat.yml
> recipe_lauer22jclim_fig9-11ab_scatter.yml
>
> We also had to increase time on these from the "Recipes that failed of other reasons or are still running" section.

The following recipes are now running successfully, so I will update the comments above:

recipe_climate_change_hotspot.yml

2024-05-16 13:42:09,525 UTC [170675] INFO    Time for running the recipe was: 4:20:19.772793
2024-05-16 13:42:10,337 UTC [170675] INFO    Maximum memory used (estimate): 50.4 GB
[...]
2024-05-16 13:42:12,725 UTC [170675] INFO    Run was successful

recipe_eyring06jgr.yml

2024-05-16 14:24:00,524 UTC [88405] INFO    Time for running the recipe was: 4:58:26.498892
2024-05-16 14:24:01,288 UTC [88405] INFO    Maximum memory used (estimate): 97.0 GB
[...]
2024-05-16 14:24:01,415 UTC [88405] INFO    Run was successful

recipe_ipccwg1ar6ch3_fig_3_42_b.yml

2024-05-16 13:57:25,039 UTC [76122] INFO    Time for running the recipe was: 4:32:29.955802
2024-05-16 13:57:25,700 UTC [76122] INFO    Maximum memory used (estimate): 225.9 GB
[...]
2024-05-16 13:57:27,644 UTC [76122] INFO    Run was successful

Should I update the time for these recipes in SPECIAL_RECIPES in generate.py?

What should we do with the recipes that don't run within 8 hours?
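
When deciding on new time and memory settings, the run time and peak memory can be read straight from the main log. The sketch below parses the two log messages quoted above; the function name `summarise_log` is made up for illustration, and it assumes run times below 24 hours, as in the excerpts here. It does not touch generate.py or assume anything about the format of SPECIAL_RECIPES.

```python
import re
from datetime import timedelta

# Patterns for the two log messages shown in the excerpts above.
TIME_RE = re.compile(
    r"Time for running the recipe was: (\d+):(\d{2}):(\d{2})"
)
MEMORY_RE = re.compile(r"Maximum memory used \(estimate\): ([\d.]+) GB")


def summarise_log(text):
    """Return (runtime, memory_gb) found in the log text.

    Either value is None if the corresponding message is missing,
    for example because the run did not finish.
    """
    runtime = None
    memory_gb = None
    for line in text.splitlines():
        match = TIME_RE.search(line)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups())
            runtime = timedelta(hours=hours, minutes=minutes, seconds=seconds)
        match = MEMORY_RE.search(line)
        if match:
            memory_gb = float(match.group(1))
    return runtime, memory_gb


LOG = """\
2024-05-16 13:42:09,525 UTC [170675] INFO    Time for running the recipe was: 4:20:19.772793
2024-05-16 13:42:10,337 UTC [170675] INFO    Maximum memory used (estimate): 50.4 GB
2024-05-16 13:42:12,725 UTC [170675] INFO    Run was successful
"""

runtime, memory_gb = summarise_log(LOG)
print(runtime, memory_gb)
```

A missing "Time for running the recipe" message (runtime is None) is also a quick way to spot runs that never finished.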

ehogan · 2024-05-17T10:03:06Z

6. Recipes that failed because they used too much memory

recipe_model_evaluation_basics.yml

We've increased the memory on this one.

The following recipe is now running successfully, so I will update the comments above:

2024-05-16 09:28:34,122 UTC [86954] INFO    Time for running the recipe was: 0:01:42.672771
2024-05-16 09:28:34,977 UTC [86954] INFO    Maximum memory used (estimate): 73.2 GB
[...]
2024-05-16 09:28:35,092 UTC [86954] INFO    Run was successful

This is a new recipe since ESMValTool v2.10.0, so it will need adding to SPECIAL_RECIPES in generate.py.

ehogan · 2024-05-17T10:23:07Z

@bouweandela, @valeriupredoi, would it be possible to get some guidance on what to do now, please? How many of the failures above must we fix before moving onto the ESMValTool freeze and testing stages? Can all the diagnostic and data issues wait until ESMValTool testing? 🤔

valeriupredoi · 2024-05-17T11:10:46Z

Super work, guys! Here's me 3 cents (2 cents adjusted for inflation):

* Julia example recipe is in the broken recipes list because the plot it produces is rubbish, see [Julia] Use NCDatasets instead of netCDF - masked values are treated as masked only in NCDatasets ESMValTool#3476
* it'd be good to have a look at the recipes that failed due to diagnostic error - please add a link against each of those pointing to the output so we have an understanding if it's the same stuff from last release (in which case we should prob put those in the broken recipes) or if it's a new barf, in which case it'd need fixing
* I'll have a looksee myself, but if it's a broken diagnostic because of the diagnostic itself, i.e. not because of some ESMValCore functionality, best to ask the diagnostic developers by tagging them here (if there are none, then let's see who'd know best)

schlunma · 2024-05-21T05:57:26Z

A possible reason for some of these failures could be iris' new attribute handling: since version 3.8, iris now distinguishes between local and global attributes. We adopted this new behavior in #2398.

This was the reason for the errors in recipe_schlund20esd.yml (fixed in ESMValGroup/ESMValTool#3605) and recipe_wenzel16jclim.yml (fixed in ESMValGroup/ESMValTool#3603).
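
To illustrate the kind of error this causes: any code that uses a cube's attributes as a dictionary key or set member needs them to be hashable, and a mutable mapping is not. A minimal sketch of one workaround follows, using plain dicts as stand-ins for the iris attributes object. The helper `attributes_key` and the dataset names are made up for illustration, and this is not necessarily how ESMValGroup/ESMValTool#3605 fixes it.

```python
from collections import defaultdict


def attributes_key(attributes):
    """Return a hashable, order-independent key for a mapping of attributes."""
    return tuple(sorted((str(k), str(v)) for k, v in attributes.items()))


# Plain dicts stand in for cube attributes; like any mutable mapping,
# they cannot be used as dictionary keys directly.
attrs_a = {"project": "CMIP6", "dataset": "MODEL-A"}
attrs_b = {"dataset": "MODEL-A", "project": "CMIP6"}  # same content, other order
attrs_c = {"project": "CMIP6", "dataset": "MODEL-B"}

# Group cube names by their attributes using the hashable key.
groups = defaultdict(list)
for name, attrs in [("cube_1", attrs_a), ("cube_2", attrs_b), ("cube_3", attrs_c)]:
    groups[attributes_key(attrs)].append(name)

print(list(groups.values()))  # [['cube_1', 'cube_2'], ['cube_3']]
```

Converting values with `str` keeps the key hashable even when an attribute value is a list or array, at the cost of treating `1` and `"1"` as equal.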

ehogan · 2024-05-22T15:47:02Z

> Super work, guys! Here's me 3 cents (2 cents adjusted for inflation):
>
> Julia example recipe is in the broken recipes list because the plot it produces is rubbish, see [Julia] Use NCDatasets instead of netCDF - masked values are treated as masked only in NCDatasets ESMValTool#3476

Apologies @valeriupredoi, you did say this previously, and I promptly forgot! I will update the comment above appropriately 👍

valeriupredoi · 2024-05-22T16:18:47Z

Not a worry, Emma, release time is a very busy one 🙂

bouweandela · 2024-05-23T07:10:50Z

> @bouweandela, @valeriupredoi, would it be possible to get some guidance on what to do now, please? How many of the failures above must we fix before moving onto the ESMValTool freeze and testing stages? Can all the diagnostic and data issues wait until ESMValTool testing? 🤔

If you suspect it is an ESMValCore issue, I would recommend fixing it before moving on to testing ESMValTool, but otherwise you should be fine to move on.

> Should I update the time for these recipes in SPECIAL_RECIPES in generate.py?

Yes, that would be helpful for the next release manager.

> What should we do with the recipes that don't run within 8 hours?

Are these recipes still running after 8 hours? In my experience, sometimes processes get killed without SLURM telling you. If there are no more log messages in the debug log or diagnostic script logs long before the 8 hours are over, it seems likely that the process has silently crashed. If this is the case, you could try reducing the number of workers used by Dask. This can be done by configuring the distributed scheduler, or, if there are non-lazy preprocessor functions (#674) in the recipe, you can use the default scheduler and create a file called ~/.config/dask/dask.yml and put

num_workers: 16

in it. That will use just 16 threads instead of the default 128 on a default levante compute node, leaving 256GB/16 = 16GB of RAM per thread instead of just 2GB.

ehogan · 2024-07-01T13:41:26Z

Closing this issue in favour of #2468 😊

chrisbillowsMO added the release label May 16, 2024

chrisbillowsMO added this to the v2.11.0 milestone May 16, 2024

chrisbillowsMO self-assigned this May 16, 2024

schlunma mentioned this issue May 21, 2024

Fixed attribute handling in austral_jet/main.ncl for iris>=3.8 ESMValGroup/ESMValTool#3603

Merged

12 tasks

ehogan mentioned this issue May 22, 2024

Update the name of the remapcon2 operator in R recipes ESMValGroup/ESMValTool#3610

Closed

mo-gill mentioned this issue May 24, 2024

Recipe testing and comparison for release 2.11.0 ESMValGroup/ESMValTool#3616

Closed

This was referenced Jun 21, 2024

Broken R recipes from v2.11.0 due to use of R v4.3.0 ESMValGroup/ESMValTool#3674

Closed

[v2.11.0 release] CMIP5 Omon CESM1 data on DKRZ has gone walkies 😕 ESMValGroup/ESMValTool#3693

Open

ehogan closed this as completed Jul 1, 2024
