Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2048 add support for numpy 2 #2050

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Conversation

G-D-Petrov
Copy link
Collaborator

@G-D-Petrov G-D-Petrov commented Dec 6, 2024

Reference Issues/PRs

#2048

What does this implement or fix?

There quite a few distinct changes as most of the work was in the CI and testing.
I will try to separate the different changes here:

I've also tested in against the full suite of tests incl the persistent ones (run here)
N.B.: We need to continue using the submodule on pybind11 instead of getting it from vcpkg, because the latest version of pybind11 are not available on its vcpkg distribution.
Specifically, the versions that introduce the Numpy 2 Support.

Any other comments?

Checklist

Checklist for code changes...
  • Have you updated the relevant docstrings, documentation and copyright notice?
  • Is this contribution tested against all ArcticDB's features?
  • Do all exceptions introduced raise appropriate error messages?
  • Are API changes highlighted in the PR description?
  • Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

@G-D-Petrov G-D-Petrov force-pushed the 2048-add-support-for-numpy-2 branch 3 times, most recently from 504a15f to 20d68e9 Compare December 16, 2024 13:51
@G-D-Petrov G-D-Petrov marked this pull request as ready for review December 17, 2024 13:06
@G-D-Petrov G-D-Petrov force-pushed the 2048-add-support-for-numpy-2 branch from 8324df5 to 112ef00 Compare December 17, 2024 14:18
)

def _resolve_empty_columns(self, columns, implement_read_index):
if not implement_read_index and columns == []:
is_columns_empty = (
columns is None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in order to implement this in the NativeVersionStore here

None should mean read all columns while empty list should mean read only the index. I don't think this will work correctly but the tests are passing. Either I'm missing something or we have bad tests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea about that, my change is mainly for supporting the case where columns is ndarray (isinstance(columns, np.ndarray) and columns.size == 0)) because some of the tests pass a numpy array to columns and Numpy 2 is not happy with the old check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was columns is None added? TBH I have no idea why is this working and not breaking these tests can you check to make sure we're not doing something wrong

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

columns is None was added because in the beginning I wanted to use just the len function to check for size but this wasn't working for ndarrays. Now it is more of a circuit breaker for the check, no need to check the types and so on if it is none, so it is not strictly needed but it is a bit of on optimization.

As for the tests, I am not sure why they shouldn't pass, they are passing an empty list([]) to the function and that part of the logic is not changed, just now it is covered by or (isinstance(columns, list) and len(columns) == 0)

@h-vetinari h-vetinari mentioned this pull request Dec 20, 2024
@@ -0,0 +1,2 @@
# Makes sure we are able to use Numpy 1
numpy<2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what matrix of Python (minor) and Pandas/numpy (major) versions are now covered by the CI?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following is the matrix on Linux based on the latest run in this branch:

Python Version NumPy Version Pandas Version
Python 3.7 1.21.6 1.3.5
Python 3.8 1.24.4 2.0.3
Python 3.8 (Compat) 1.24.4 1.5.3
Python 3.9 2.0.2 2.2.3
Python 3.10 2.2.1 2.2.3
Python 3.11 2.2.1 2.2.3
Python 3.11 (Compat) 1.26.4 2.2.3
Python 3.12 2.2.1 2.2.3

There is no Numpy 2 for python 3.8 and below.
Python 3.6 backwards/forwards compat can still be tested when the persistent tests are run.

@@ -119,14 +119,16 @@ struct NativeTensor {
[[nodiscard]] auto expanded_dim() const { return expanded_dim_; }
template<typename T>
const T *ptr_cast(size_t pos) const {
util::check(ptr != nullptr, "Unexpected null ptr in NativeTensor");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this no longer needed?

Copy link
Collaborator Author

@G-D-Petrov G-D-Petrov Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been added. It is a bit of a sanity check because if there is a incompatibility of the numpy version used at runtime vs compile time, the whole numpy object is misread and this pointer is null

phoebusm and others added 8 commits December 27, 2024 09:48
-Upgrade pybind in conda to 2.11.1
-Move pybind11 from submodule to vcpkg

-Reinstate pybind11 git submodule
-Bypass GIL check in some places

Bypass nan increase ref and decreas ref as well

Remove more py::none{} call

Skip GIL check in C++ test

Add support for numpy 2

Support numpy on conda

Remove Werror for testing

Fix deprecation warnings

Test fix for new pybind cmake checks

Try to disable FindPythonInterp and FindPythonLIbs

Try to set the python vars manually

remove policy

Fix nan related tests

Fix builds

test with other Utils

test

try to set extension manually

add endif

change

Test turning off the pybind11 override

Tests without the PythonUtils module

Try with latest pybind11

Test PYBIND11_NOPYTHON

Test with PYBIND11_FINDPYTHON

idk

Try to set vars for both PYTHON and Python prefix

Try setting both Python and PYTHON

try new pythonutils

test like this

Add support for both Numpy 1 and 2

Remove 3.6 and fix some tests

Totally remove 3.6

test with a new columns condition

Test new action steps

fix tests

Test with a python pin for conda tests

Test with 3.11

Test with newer changes

Test without calling PythonUtils twice

Test without PYBIND11_NO_ASSERT_GIL_HELD_INCREF_DECREF definition

Add compact tests for Numpy 1 and test newer python versions

test with latest cibuildwheel version

test with latest cibuildwheel

Test with newer cibuildwheel

test with http endpoint

Don't build for python 13

Disable compat tests for python 3.12 and 13 as we haven't published them yet

Fix integration tests

fix flaky test
@G-D-Petrov G-D-Petrov force-pushed the 2048-add-support-for-numpy-2 branch from adee4d7 to 6477529 Compare December 27, 2024 07:48
@willdealtry
Copy link
Collaborator

Are you sure that the encoding_version error has been properly understood? The values for the encoding_version are actually 0 (the default) and 1, even though we refer to it as version 1 and 2. It looks like it's complaining it has been given an invalid encoding version

@G-D-Petrov
Copy link
Collaborator Author

G-D-Petrov commented Dec 30, 2024

Are you sure that the encoding_version error has been properly understood? The values for the encoding_version are actually 0 (the default) and 1, even though we refer to it as version 1 and 2. It looks like it's complaining it has been given an invalid encoding version

@willdealtry yes, the older versions are performing this check that is checking specifically for version 0 aka 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants