
Multimodel regression tests don't fail, even though plev values in the sample data are inconsistent #956

Closed
Peter9192 opened this issue Jan 20, 2021 · 20 comments

@Peter9192 (Contributor) commented Jan 20, 2021

Describe the bug
In #950 I implemented regression tests for a new/alternative multi-cube statistics engine. They fail because the sample data has differing plev coordinates: some of the cubes have many levels, most only two. But more importantly, two cubes have values of [100000.00000001, 92500.00000001] for the air_pressure coordinate.

To my surprise, this doesn't get picked up by the current implementation, which happily uses the plev coordinate of the first input cube for the result. In this case it's safe to assume that the levels are supposed to be the same, but this may not always be the case. I would sleep better if ESMValTool caught this kind of inconsistency.
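For concreteness, a minimal sketch of the mismatch (the coordinate values come from the description above; the comparison itself is illustrative, not the tool's actual check):

```python
import numpy as np

# CMOR-standard pressure levels vs. what two of the sample cubes carry
plev_expected = np.array([100000.0, 92500.0])
plev_actual = np.array([100000.00000001, 92500.00000001])

print(np.array_equal(plev_expected, plev_actual))  # False: not bit-identical
print(np.allclose(plev_expected, plev_actual))     # True: off by ~1e-8 Pa
```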

@stefsmeets @valeriupredoi any thoughts on this?

@stefsmeets (Contributor) commented Jan 21, 2021

Good to see that this is picked up by the new multi-model functions! Initially I had an extract_levels step in the preprocessor for these tests, but I took it out, because apparently these levels are set by the CMOR standards, so I assumed them all to be the same.

I would rather not code too many work-arounds in these tests. One way to solve it would be to ignore the problematic datasets here: https://github.com/ESMValGroup/ESMValTool_sample_data/blob/master/datasets.yml.

@stefsmeets (Contributor) commented:

Made an issue here:
ESMValGroup/ESMValTool_sample_data#19

@Peter9192 (Contributor, Author) commented:

I wonder if the cmor checks/fixes pick this up if you run it normally in a recipe. In that case we might need to pull the sample data through the checks before using it in the tests.

@stefsmeets (Contributor) commented:

I don't know if the fixes work with this, but the risk is that we are re-implementing the entire tool just to get the tests to run.

@valeriupredoi (Contributor) commented:

@Peter9192 can you pls point me to the branch and test that displays this behaviour mate? 🍺

@Peter9192 (Contributor, Author) commented:

> can you pls point me to the branch and test that displays this behaviour mate? 🍺

It's already merged:

```python
def test_multimodel_regression_month(timeseries_cubes_month, span):
```
My question is: why doesn't this fail, and should it?

See this PR where the test was first implemented (#856)

@valeriupredoi (Contributor) commented:

cheers, will have a look now!

@valeriupredoi (Contributor) commented:

This is not an issue; those tests will never fail, and they shouldn't. The multimodel dictionary keyed by statistic = "mean" (by the way, maybe you should allow for "median" etc. as well) grabs the multimodel cube, which will always have the non-temporal coordinates of the first cube in the list of cubes the multimodel statistic is computed on: in this case, a cube with a nice round set of plev points.
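To illustrate the point, a toy reconstruction of the behaviour described above (not the actual multimodel code):

```python
import numpy as np
from iris.coords import DimCoord
from iris.cube import Cube

def make_cube(plev_points):
    plev = DimCoord(np.array(plev_points, dtype=float),
                    standard_name='air_pressure', units='Pa')
    return Cube(np.ones(len(plev_points)), dim_coords_and_dims=[(plev, 0)])

reference = make_cube([100000.0, 92500.0])            # nice round points
other = make_cube([100000.00000001, 92500.00000001])  # slightly off

# If the statistic is computed on the raw data arrays and the result cube is
# copied from the first cube in the list, the off-by-1e-8 points never surface:
result = reference.copy(data=(reference.data + other.data) / 2)
print(result.coord('air_pressure').points)  # [100000.  92500.] -- from `reference`
```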

@Peter9192 (Contributor, Author) commented:

Yeah, I see how it doesn't fail, but what if the deviations are larger? Say plev = [100000, 92500] on one cube and [100000, 95000] on the other. Does that pass as well?

@valeriupredoi (Contributor) commented:

As long as the first cube in the multimodel cubes list has the same plev coordinate (or any other non-time coordinate) as the reference cube, it will pass; and as I see it, it does.

@stefsmeets stefsmeets added the bug Something isn't working label Jan 22, 2021
@valeriupredoi (Contributor) commented:

If you're OK with it, I think we can safely close this 👍

@Peter9192 (Contributor, Author) commented:

I don't think I fully understand your argument. The way I see it, if the plevs on the cubes are different, multimodel should not combine them.

@valeriupredoi (Contributor) commented Jan 25, 2021

No no, multimodel is a blunt tool: if the data dimensions are not consistent across the cubes, say one cube has three levels whereas all the others have two, then yes, it will fail; but if the shapes don't differ, multimodel will work a-OK. Plus, remember that the MM cube is constructed based on the first cube (the reference one) in the cubes list, which in this case has two levels, both nice round integers. We could set up a check on levels that warns the user when cubes have different level values, but maybe the user actually wants to compute a MM regardless of what the level values are?
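A hypothetical check along those lines might look as follows (the helper name, warning text, and failure policy are all assumptions, not existing ESMValCore code):

```python
import warnings
import numpy as np

def check_levels(cubes, coord_name='air_pressure'):
    """Hypothetical helper: fail on shape mismatches, warn on value mismatches."""
    reference = cubes[0].coord(coord_name).points
    for i, cube in enumerate(cubes[1:], start=1):
        points = cube.coord(coord_name).points
        if points.shape != reference.shape:
            # This is the case multimodel already cannot handle.
            raise ValueError(f'Cube {i} has {points.shape[0]} levels, '
                             f'expected {reference.shape[0]}.')
        if not np.array_equal(points, reference):
            # Shapes agree but values differ: warn instead of silently
            # adopting the reference cube's levels.
            warnings.warn(f'Cube {i} has different {coord_name} values than '
                          f'the reference cube; using the reference values.')
```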

@Peter9192 (Contributor, Author) commented:

> maybe the user actually wants to compute a MM regardless of what the level values are?

I guess that's the key question of this issue. IMHO we should take a stand here and (strongly) discourage this. It sounds like computing the mean over Amsterdam and London and then calling it the Amsterdam mean. Can you provide an example use case where one might want to do this? (This will help me determine the requirements for #950.)

@bouweandela (Member) commented:

If the error message coming from iris is not clear, I think it would be user-friendly to have a check in the multimodel statistics function that the input dimensions are compatible.

Note that there is a function for checking that points match the requested points from the CMOR table:

```python
if point not in coord_points:
```

but it only issues a warning, not an error. Maybe it could be improved so it automatically fixes minute differences? @jvegasbsc Many observational datasets will be available on different vertical levels than the ones in the CMOR table, though.
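Such an automatic fix could, for example, snap near-miss points onto the CMOR-table values (a sketch; snap_to_cmor_levels and the tolerance are made up for illustration):

```python
import numpy as np

def snap_to_cmor_levels(points, cmor_points, rtol=1e-7):
    """Replace points that are within floating-point noise of a CMOR-table
    value with the exact table value; leave genuinely different points alone."""
    points = np.array(points, dtype=float)
    cmor_points = np.asarray(cmor_points, dtype=float)
    for i, point in enumerate(points):
        close = np.isclose(cmor_points, point, rtol=rtol)
        if close.any():
            points[i] = cmor_points[close.argmax()]
    return points

print(snap_to_cmor_levels([100000.00000001, 92500.00000001],
                          [100000.0, 92500.0, 85000.0]))
# -> [100000.  92500.]
```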

If we do not create the automated fix, we may need to update this function a bit:

```python
if (src_levels.shape == levels.shape
        and np.allclose(src_levels.points, levels)):
    # Only perform vertical extraction/interpolation if the source
    # and target levels are not "similar" enough.
    result = cube
```

so it actually sets the vertical levels to the requested ones when they are almost equal, because that is how people will typically use the multimodel statistics functions, i.e. in combination with regrid and extract_levels, and possibly regrid_time in the future.
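The suggested change could look roughly like this (a sketch based on the snippet above; the surrounding function is simplified and the coordinate access via axis='Z' is an assumption):

```python
import numpy as np

def maybe_skip_interpolation(cube, levels):
    """Sketch: when source and target levels are almost equal, skip the
    interpolation but still snap the points to the requested values, so that
    downstream steps like multi-model statistics see identical coordinates."""
    src_levels = cube.coord(axis='Z')
    if (src_levels.shape == np.shape(levels)
            and np.allclose(src_levels.points, levels)):
        cube.coord(axis='Z').points = np.asarray(levels, dtype=float)
        return cube
    raise NotImplementedError('fall back to the real extraction/interpolation')
```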

@valeriupredoi (Contributor) commented:

Back in 1947, when I built the multimodel module, its specifications did not contain a check on equality or all-closeness of the vertical level points; mere data-shape consistency was deemed enough. I agree that the problem is nasty when you have a model with X levels close to the ground and another model with the same X levels up in the stratosphere, but there are very slim chances of that happening. I am open to suggestions: either we create a dedicated multimodel checker, or we go about it as Bouwe explains above 👍

@Peter9192 (Contributor, Author) commented:

Thanks @bouweandela and @valeriupredoi! I will come up with a proposal in #950 (or rather, I think I'll abandon that one in favour of a new PR). This context helps a lot :-)

@zklaus commented Oct 12, 2021

Sorry, @Peter9192, this issue has been stale for a while. My feeling is that it might depend on #968, which will not make its way into 2.4.0 either, hence I will bump this to 2.5.0; please feel free to correct me or to comment on a way forward.

@zklaus zklaus modified the milestones: v2.4.0, v2.5.0 Oct 12, 2021
@Peter9192 (Contributor, Author) commented:

I think (but haven't checked) that this issue is solved by #1177 and we can close it.

@zklaus zklaus removed this from the v2.5.0 milestone Oct 13, 2021
@zklaus zklaus added this to the v2.4.0 milestone Oct 13, 2021
@zklaus commented Oct 13, 2021

Thanks, I'll close it then. Please reopen if necessary.
