Multimodel regression tests don't fail, even though plev in sample data are inconsistent. #956
Good to see that this is picked up by the new multi-model functions! I would rather not code too many work-arounds in these tests. One way to solve it would be to ignore the problematic datasets here:
Made an issue here:
I wonder if the cmor checks/fixes pick this up if you run it normally in a recipe. In that case we might need to pull the sample data through the checks before using it in the tests.
I don't know if the fixes work with this, but the risk is that we are re-implementing the entire tool just to get the tests to run.
@Peter9192 can you pls point me to the branch and test that displays this behaviour mate? 🍺
It's already merged:
See this PR where the test was first implemented (#856).
Cheers, will have a look now!
This is not an issue: those tests will never fail, and they shouldn't either; the multimodel dictionary is keyed by
Yeah, I see how it doesn't fail, but what if the deviations are larger? Say plev = [100000, 92500] on one cube and [100000, 95000] on the other. Does that pass as well?
As long as the first cube in the multimodel cubes list has the same plev coordinate as the reference cube (or any other non-time coordinate), it will pass, and as I see it, it has.
If you're OK with it, I think we can safely close this 👍
I don't think I fully understand your argumentation. The way I see it, if the plevs on the cubes are different, multimodel should not combine them.
No no, multimodel is a blunt tool. If the data dimensions are not consistent across the cubes, then yes, it will fail; that means one cube has 3 levels whereas all the others have two. But if the shapes don't differ, multimodel will work a-ok. Plus, remember that the MM cube is constructed based on the first cube (the reference one) in the cubes list, which in this case has two levels, and they are both nice ints. We could set up a test that warns the user when cubes have different values for levels, but maybe the user actually wants to compute a MM regardless of what the level values are?
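The shape-only behaviour described above can be illustrated with a minimal NumPy sketch. The cube objects and the real multimodel internals are simplified away here; only the point that a shape check alone lets mismatched levels through is being demonstrated:

```python
import numpy as np

# Two hypothetical "cubes": identical shapes, but disagreeing on what
# the second vertical level actually is.
levels_a = np.array([100000.0, 92500.0])
levels_b = np.array([100000.0, 95000.0])

data_a = np.array([[1.0, 2.0], [3.0, 4.0]])  # dims (plev, lat), say
data_b = np.array([[5.0, 6.0], [7.0, 8.0]])

# A shape-consistency check alone lets this pass ...
assert data_a.shape == data_b.shape

# ... so a "multi-model mean" is computed happily, and the result
# silently inherits the first (reference) cube's levels, even though
# the inputs disagree on them.
mm_mean = np.mean([data_a, data_b], axis=0)  # [[3., 4.], [5., 6.]]
result_levels = levels_a
```

A value-aware check (e.g. `np.allclose(levels_a, levels_b)`) would catch the disagreement before averaging.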
I guess that's the key question of this issue. IMHO I think we might take a stand here and (strongly) discourage this idea. It sounds like computing the mean over Amsterdam and London and then calling it the Amsterdam mean. Can you provide an example use case where one might want to do this? (This will help me determine the requirements for #950). |
If the error message coming from iris is not clear, I think it would be user-friendly to have a check in the multimodel statistics function that the input dimensions are compatible. Note that there is a function for checking that points match the requested points from the CMOR table (ESMValCore/esmvalcore/cmor/check.py, line 633 at b632f9a),
but it only issues a warning, not an error. Maybe it could be improved so it automatically fixes minute differences? @jvegasbsc Many observational datasets will be available on different vertical levels than the ones in the CMOR table though. If we do not create the automated fix, we may need to update this function a bit (ESMValCore/esmvalcore/preprocessor/_regrid.py, lines 529 to 533 at b632f9a)
so it actually sets the vertical levels to the requested values when they are almost equal, because that is how people will typically use the multimodel statistics functions, i.e. in combination with regrid and extract_levels, and possibly regrid_time in the future.
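The "set the levels to the requested values when they are almost equal" idea can be sketched as below. This is a hypothetical helper (`snap_levels` is not an ESMValCore function, and the real `_regrid.py` logic differs); it only illustrates the snapping behaviour being proposed:

```python
import numpy as np

def snap_levels(levels, requested, rtol=1e-7):
    """Return `requested` when `levels` match it within tolerance.

    Hypothetical sketch of auto-fixing minute floating-point
    differences in vertical level points; otherwise the original
    levels are left untouched.
    """
    levels = np.asarray(levels, dtype=float)
    requested = np.asarray(requested, dtype=float)
    if levels.shape == requested.shape and np.allclose(levels, requested, rtol=rtol):
        return requested
    return levels

# Minute floating-point offsets get snapped to the CMOR values ...
snapped = snap_levels([100000.00000001, 92500.00000001], [100000, 92500])
# ... while genuinely different levels are kept as-is.
kept = snap_levels([100000, 95000], [100000, 92500])
```

The tolerance would need tuning against real CMOR tables; observational datasets on genuinely different levels would fall through to the unchanged branch, as discussed above.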
When, back in 1947, I built the multimodel module, its specifications did not contain a check on equality or all-closeness of the vertical level points; mere data-shape consistency was deemed enough. I agree that the problem is nasty when you have a model with X levels close to the ground and another model with the same X levels up in the stratosphere, but the chances of that happening are very slim. I am open to suggestions: either we create a dedicated multimodel checker or we go about it as Bouwe explains above 👍
Thanks @bouweandela and @valeriupredoi ! I will come up with a proposal in #950 (or rather, I think I'll abandon that one in favour of a new PR). This context helps a lot :-) |
Sorry, @Peter9192, this issue has been stale for a while. My feeling is that it might depend on #968, which also will not make its way into 2.4.0, hence I will bump this to 2.5.0, but please feel free to correct me or to comment on a way forward.
I think (but haven't checked) that this issue is solved by #1177 and we can close it. |
Thanks, I'll close it then. Please reopen if necessary. |
Describe the bug
In #950 I implemented regression tests for a new/alternative multi-cube statistics engine. It fails because the sample data has differing coordinates on plev: some of the cubes have many levels, most only two. But more importantly, there are 2 cubes that have values of
[100000.00000001, 92500.00000001]
for the air_pressure
coordinate. To my surprise, this doesn't get picked up by the current implementation, which happily uses the plev coordinate of the first input cube for the result. In this case it's safe to assume that the levels are supposed to be the same, but this may not always be the case. I would sleep better if ESMValTool would catch these kinds of inconsistencies.
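For concreteness, the offending sample-data values differ from the expected CMOR levels only by floating-point noise, which is exactly the case a tolerance-based check would distinguish from a real mismatch (NumPy is used here purely for illustration):

```python
import numpy as np

# Levels as found in the problematic sample-data cubes ...
sample_plev = np.array([100000.00000001, 92500.00000001])
# ... versus the expected CMOR values.
cmor_plev = np.array([100000.0, 92500.0])

# Not bitwise equal, so naive equality checks flag them ...
bitwise_equal = np.array_equal(sample_plev, cmor_plev)   # False
# ... but equal within any reasonable tolerance, so they are
# clearly "the same levels" up to noise.
close_enough = np.allclose(sample_plev, cmor_plev)       # True
```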
@stefsmeets @valeriupredoi any thoughts on this?