Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Numpy2] Support for numpy==2.0.0 #2395

Open
wants to merge 63 commits into
base: main
Choose a base branch
from
Open

[Numpy2] Support for numpy==2.0.0 #2395

wants to merge 63 commits into from

Conversation

valeriupredoi
Copy link
Contributor

@valeriupredoi valeriupredoi commented Apr 12, 2024

Description

This PR introduces the changes we need to support Numpy 2.0 (these changes were collected from running the tests, so they may not be 100% exhaustive, but given that we have 95% test coverage, I'd say they are 95% exhaustive; we can deal with missed cases as they come along, if any; I would not use the ruff plugin test Numpy recommends to migrate, since that will introduce a lot of style changes that we may want to make when we include ruff as our linter). Most changes are in our testing suite, but there are a couple in the source code; these are backwards compatible changes, ie all tests pass when running with numpy=1.26.4

The only test that fails (both for Numpy2 and numpy legacy) is related to the pins on pandas, that need to go since we need to use the latest pandas for environment compatibility, see #2459 ; I'll try fix that issue here, since it's related to (but not a direct consequence of) Numpy2.

Closes #2460
Closes #2459

iris issue to monitor SciTools/iris#6043

-> which that is already in iris==3.11, which leads to

TODO pin iris>=3.11

The second issue closure is more of a brush it under the carpet via a pytest xfail, the original issue that kept creeping up is #2305 and the actual issue is with Pandas, and listed in pandas-dev/pandas#57002 - we can wait on its fix (and have us test quietness via the test xfail) or @schlunma can propose a more elegant solution in a separate PR?

Historical guff

OK @ESMValGroup/technical-lead-development-team we finally have a correct set of tests with the latest Numpy=2.0.0 (stable) - don't mind the ones related to pandas, I had to unpin it to gte pandas=2.2.2 built with Numpy 2.0 (otherwise the environment wouldn't solve) -> we have some work to do 👍 https://github.com/ESMValGroup/ESMValCore/actions/runs/9582595825/job/26421917358?pr=2395

Numpy 2.0 is impending: it is now fully out (see https://anaconda.org/conda-forge/numpy ); resources https://anaconda.org/conda-forge/numpy/files and https://github.com/numpy/numpy/releases - they recommend a test with the RC and list the changes roadmap in https://numpy.org/devdocs/numpy_2_0_migration_guide.html so we can test ourselves. Note: this branch should not be merged! If everything works out fine, we'll just pick up numpy 2.0 as per normal. ATTN: @ESMValGroup/technical-lead-development-team

  • seems the 2.0.0rc1 is a bit of a ninja - in Where is numpy=2.0.0rc1 on conda-forge? numpy/numpy#26271 it is reported that the channel is conda-forge/label/numpy_dev
  • moving to said conda channel results in conflicts for python-stratify, shapely, and pandas, since most prob they are pinning the upper bound of numpy to <2.0, see GA tests
  • a manual install is then needed; that results in mayhem - some of the fails are due to cftime, as @bouweandela points out in comments, see Numpy 2.0 incompatibility Unidata/cftime#318 (and linked PR), others are due to actual package compilation against a Numpy=1.x that barfs when it sees 2.x (those should be solved at recompile/redeploy time), and others are due to netCDF4 complaining about dtypes - that needs a closer look

Copy link

codecov bot commented Apr 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.77%. Comparing base (30951fa) to head (fb6cdc3).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2395   +/-   ##
=======================================
  Coverage   94.77%   94.77%           
=======================================
  Files         251      251           
  Lines       14266    14270    +4     
=======================================
+ Hits        13520    13524    +4     
  Misses        746      746           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@valeriupredoi
Copy link
Contributor Author

@bouweandela
Copy link
Member

xref Unidata/cftime#318

@valeriupredoi
Copy link
Contributor Author

Spot on @bouweandela 🍻 Fortunately there is a mature PR for it Unidata/cftime#319

@valeriupredoi
Copy link
Contributor Author

K then, just waiting for a new cftime to be released

@valeriupredoi valeriupredoi changed the title [DNM][Test Only] Test with numpy==2.0.0rc1 [DNM][Test Only] Test with numpy==2.0.0rc2 May 27, 2024
@valeriupredoi
Copy link
Contributor Author

still royally hosed even with rc2

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks V, looks really good already! I have a couple of comments, mostly about removing the comments if the change is fully backwards-compatible.

Comment on lines 247 to 249
# previous to numpy 2.0, a np.array(esmpy_lat.shape) was used
# numpy>=2.0 throws ValueError: matrix transpose with ndim<2 is undefined
grid = esmpy.Grid(np.vstack(esmpy_lat.shape),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backwards-compatible? If yes, do we need this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is, functionally and it returns the same sort of structure, only typed differently

Comment on lines 172 to 176
# explicitly casting to int32 since without it, in Numpy 2.0
# one gets OverflowError: Python integer 360 out of bounds for int8
# see notes https://numpy.org/devdocs/numpy_2_0_migration_guide.html
src_points_with_convex_hull[-2 * n_hull:-n_hull, 1] -= np.int32(360)
src_points_with_convex_hull[-n_hull:, 1] += np.int32(360)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this work with 360.0 instead of np.int32(360)? The points should be floats, not integers, so to me this makes much more sense. If this works, you can also remove the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, will test - but I agree about points' types indeed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nopers, a float there throws this:

>       src_points_with_convex_hull[-2 * n_hull:-n_hull, 1] -= 360.
E       numpy._core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'subtract' output from dtype('float64') to dtype('int8') with casting rule 'same_kind'

so we either np.int32 the 360 or we np.float the left member

esmvalcore/preprocessor/_derive/co2s.py Outdated Show resolved Hide resolved
esmvalcore/preprocessor/_time.py Outdated Show resolved Hide resolved
tests/integration/cmor/_fixes/icon/test_icon.py Outdated Show resolved Hide resolved
@@ -920,9 +920,12 @@ def test_season_not_available(self):
name='clim_season',
seasons=['JFMAMJ', 'JASOND'],
)

# numpy>=2.0 these need to be explicitly cast to numpy strings
two_seasons = [np.str_('JASOND'), np.str_('JFMAMJ')]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is not backwards-compatible? How does the error look like? It might make sense to adapt this in the actual code so the error message still looks nice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it is, as it is now, ie the test does what it says on the lid for both numpy<2 and numpy>=2, here's the failed test for numpy>=2 if we don't change to forced typing:

>       with pytest.raises(ValueError, match=re.escape(msg)):
E       AssertionError: Regex pattern did not match.
E        Regex: "Seasons\\ \\('DJF',\\ 'MAM',\\ 'JJA',\\ 'SON'\\)\\ do\\ not\\ match\\ prior\\ season\\ extraction\\ \\['JASOND',\\ 'JFMAMJ'\\]\\."
E        Input: "Seasons ('DJF', 'MAM', 'JJA', 'SON') do not match prior season extraction [np.str_('JASOND'), np.str_('JFMAMJ')]."

tests/unit/preprocessor/_time/test_time.py:930: AssertionError

esmvalcore/preprocessor/_compare_with_refs.py Outdated Show resolved Hide resolved
@bouweandela
Copy link
Member

Maybe we should wait for Iris to be numpy 2 compatible: SciTools/iris#6035

@bouweandela
Copy link
Member

It looks like that will happen when Iris 3.11 is released at the earliest, planned in October: SciTools/iris#5571 (comment).

@valeriupredoi
Copy link
Contributor Author

many thanks for the reviews and comments @schlunma and @bouweandela 🍺 x2 I don't think we can wait until October TBF, I say we get this in and see what's the worst it can happen, and fairly sure not much, and if there are problems, we can alert iris and tell them to hurry up 😁

@bouweandela
Copy link
Member

I say we get this in and see what's the worst it can happen

I'm sorry, but I have to disagree. What would most likely happen is that users will get wrong results and/or crashes.

@valeriupredoi
Copy link
Contributor Author

I say we get this in and see what's the worst it can happen

I'm sorry, but I have to disagree. What would most likely happen is that users will get wrong results and/or crashes.

that is indeed the possibility; I guess there's one way to find out about the magnitude of that - run the iris tests with a numpy>=2 pin in an iris fork

@valeriupredoi
Copy link
Contributor Author

Monitoring SciTools/iris#6043

@valeriupredoi
Copy link
Contributor Author

OK folks, iris have implemented numpy2 support and closed SciTools/iris#6043 - their 3.11 is fast approaching, see https://github.com/SciTools/iris/milestone/51 - so could I please ask you to get back to reviewing this so we can plop it in as soon as the new iris is out? 🍺 The two review chevaliers: @schlunma @bouweandela 🎠

@valeriupredoi valeriupredoi mentioned this pull request Oct 30, 2024
5 tasks
@valeriupredoi valeriupredoi added the iris311 needs iris 3.11 stable release label Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request iris311 needs iris 3.11 stable release Numpy2 testing
Projects
None yet
3 participants