Bump pyarrow version to 16.0.0 #600

@JohnMoutafis JohnMoutafis commented Jul 11, 2024

Attempt to resolve the following issue:

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Reported by @jparismorgan and related to pyarrow version.
Related to this Vector Search PR
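
To see which combination triggers the error quoted above, it can help to print the installed versions first. This is a minimal sketch using only the standard library; the helper names (`installed_version`, `major_version`, `report`) are hypothetical and not part of tiledb-cloud.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Extract the leading integer of a version string such as '2.0.0'."""
    return int(version.split(".")[0])


def report(packages=("numpy", "pyarrow")):
    """Map each package name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    versions = report()
    for name, version in versions.items():
        print(f"{name}: {version or 'not installed'}")

    numpy_version = versions["numpy"]
    if numpy_version and major_version(numpy_version) >= 2:
        # Mirrors the warning text quoted above; it does not prove a failure.
        print("NumPy 2.x is installed; modules compiled against NumPy 1.x may crash.")
```

Running it in the affected environment shows whether NumPy 2.x sits next to an older pyarrow build, which is the situation the error message describes.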

@JohnMoutafis JohnMoutafis self-assigned this Jul 11, 2024
@JohnMoutafis JohnMoutafis marked this pull request as ready for review July 11, 2024 12:20

JohnMoutafis commented Jul 11, 2024

@ihnorton @NikolaosPapailiou although the tests are passing, is there any edge case breakage that may be caused by this update?

pyproject.toml Outdated
@@ -13,7 +13,7 @@ dependencies = [
 "importlib-metadata",
 "packaging",
 "pandas>=1.2.4",
-"pyarrow>=3.0.0",
+"pyarrow==15.0.0",

@jparismorgan jparismorgan Jul 11, 2024

Thanks for updating this!

Though I think we may need to update to 16.0.0 instead?

[Screenshot, 2024-07-11 at 2:25:38 PM]

(Unsure) Also, should it be >= rather than ==?


@JohnMoutafis JohnMoutafis Jul 11, 2024

I didn't know that, since both versions supposedly work with numpy v1.16.1 and above. Will update.
As for >= vs ==, I prefer pinning dependencies and updating them manually when needed, to ensure build stability and environment reproducibility.


@jparismorgan jparismorgan Jul 11, 2024

Does that mean that users of the package will need to have pyarrow==16.0.0 exactly? What would happen if they depend on another package which, for example, requires pyarrow>=16.1.0 (which seems reasonable for another package to set if they depend on a bug fix from a later release)?

Or am I misunderstanding this and it's a build-time only requirement? And at runtime they can have whatever version they want? (I didn't think so b/c when I looked at this for Vector Search I thought you'd use [build-system] requires to specify build-time requirements (like here)).

@JohnMoutafis JohnMoutafis Jul 11, 2024

That's a valid point and I am not sure about that.
I will read about it and update you here!


@JohnMoutafis JohnMoutafis Jul 11, 2024

@jparismorgan You were right about that. People using tiledb-cloud will need pyarrow 16.0.0 as a minimum version, but versions higher than that will not be blocked.
Nice catch!
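
The difference between an exact pin and a lower bound, as discussed in this thread, can be sketched with a simplified version check. This is an illustration only: the helpers (`parse_version`, `satisfies`, `compatible_versions`) and the candidate list are hypothetical, and real installers apply the full version-specifier rules (pre-releases, `~=`, wildcards) rather than this plain tuple comparison.

```python
import operator

# Simplified comparison operators; real resolvers handle many more cases.
_OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '16.1.0' into (16, 1, 0); only plain numeric releases are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Check one version against one specifier such as '>=16.0.0'."""
    for symbol in ("==", ">=", "<=", ">", "<"):  # two-character operators first
        if specifier.startswith(symbol):
            bound = parse_version(specifier[len(symbol):])
            return _OPS[symbol](parse_version(version), bound)
    raise ValueError(f"unsupported specifier: {specifier}")


def compatible_versions(candidates, specifiers):
    """Return the candidates that satisfy every specifier at once."""
    return [v for v in candidates if all(satisfies(v, s) for s in specifiers)]


# Hypothetical candidate releases, for illustration only.
CANDIDATES = ["15.0.0", "16.0.0", "16.1.0", "17.0.0"]

# Exact pin here plus another package asking for >=16.1.0: nothing fits.
pinned = compatible_versions(CANDIDATES, ["==16.0.0", ">=16.1.0"])

# Lower bound here plus the same other requirement: newer releases fit.
bounded = compatible_versions(CANDIDATES, [">=16.0.0", ">=16.1.0"])

print(pinned)   # []
print(bounded)  # ['16.1.0', '17.0.0']
```

With the exact pin, no single version satisfies both requirements, so an installer would have nothing to pick; with the lower bound, any release at or above the stricter requirement is acceptable.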

@JohnMoutafis JohnMoutafis changed the title from "Bump pyarrow version to 15.0.0" to "Bump pyarrow version to 16.0.0" Jul 11, 2024

@jparismorgan jparismorgan left a comment

Thanks for the update! Will test with Vector Search once released.

@JohnMoutafis JohnMoutafis merged commit 7f20a6b into main Jul 11, 2024
18 checks passed
@JohnMoutafis JohnMoutafis deleted the johnmoutafis/sc-50263/update-tiledb-cloud-python-package-to-fix branch July 11, 2024 14:23
JohnMoutafis added a commit that referenced this pull request Jul 12, 2024
Reverts #600 due to conflicts with numpy v2+