-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update py03 from 0.20 to 0.21 #5566
Conversation
arrow/src/pyarrow.rs
Outdated
let pyarrow = PyModule::import(value.py(), "pyarrow")?; | ||
let class = pyarrow.getattr(expected)?; | ||
let pyarrow = PyModule::import_bound(value.py(), "pyarrow")?; | ||
let class = pyarrow.getattr(expected)?.into_gil_ref(); // TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with pyo3, so I was following the migration guide: https://pyo3.rs/v0.21.0/migration.html
Of particular note is:
PyO3 0.21 introduces a new
Bound<'py, T>
smart pointer which replaces the existing "GIL Refs" API to interact with Python objects. For example, in PyO3 0.20 the reference&'py PyAny
would be used to interact with Python objects. In PyO3 0.21 the updated type isBound<'py, PyAny>
.
So it seems they are moving away from &PyAny
is my understanding. This raises a question of if the PyArrow API's should follow suit?
e.g. here, introduce a new function that uses Bound
?
Line 141 in 5f63906
fn from_pyarrow(value: &PyAny) -> PyResult<Self> { |
For now I've used into_gil_ref()
to get around this deprecation process, but I don't think it's meant to be a permanent solution.
I'll do some more reading, maybe it's been discussed in pyo3 before somewhere 🤔
Perhaps @wjones127 or @kylebarron might be able to advise on this |
I haven't tried to upgrade my projects to 0.21 yet, but yes
that's my understanding as well. I'd say that the PyArrow API should probably follow suit. Maybe while upgrading to 0.21 is the best opportunity? |
Thanks for this PR :) Fyi, I pulled in this branch to update delta-rs (see delta-io/delta-rs#2363). When
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much @Jefffrey -- this code looks good to me, though it might be nice to get someone else more versed with python to give it another check (maybe @westonpace or @wjones127 ?)
As for the report from @abhiaagarwal (thanks for that ❤️ )
The FromPyObject trait gets disabled without gil-refs, leading to the error. (source) It should be able to be fixed by changing all the extract to extract_bound per the migration guide.
I think we could potentially fix it in a follow on PR as well (or we could fix it here too)
Ideally, we should also add a CI check/test for running without gil-refs
as well
This PR doesn't enable |
Hmm, I'm a bit confused by this, as we don't have the Am I missing something? |
arrow/src/pyarrow.rs
Outdated
if !value.is_instance(&class)? { | ||
let expected_module = class.getattr("__module__")?; | ||
let expected_module = expected_module.extract::<&str>()?; | ||
let expected_name = class.getattr("__name__")?; | ||
let expected_name = expected_name.extract::<&str>()?; | ||
let found_class = value.get_type(); | ||
let found_module = found_class.getattr("__module__")?.extract::<&str>()?; | ||
let found_name = found_class.getattr("__name__")?.extract::<&str>()?; | ||
let found_module = found_class.getattr("__module__")?; | ||
let found_module = found_module.extract::<&str>()?; | ||
let found_name = found_class.getattr("__name__")?; | ||
let found_name = found_name.extract::<&str>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jefffrey these extract
s should be changed to extract_bound
. extract
only exists with gil-refs
enabled. I have no clue why the extract
version is passing CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, a bit of digging, and a public extract
actually does still exist, but it's been redirected to extract_bound
. https://github.com/PyO3/pyo3/blob/c1f11fb4bddc836f820e0db3db514b4742b8a3e7/src/types/any.rs#L805
I have no clue what's causing my build to freak out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also breaks for me when gil-refs
is not enabled. From this issue it appears that &str
lost FromPyObject
:
Finally, without the gil-refs feature we remove the existing implementation of FromPyObject for &str and instead implement this new trait. This works except for abi3 on old Python versions, where we have to settle for PyFunctionArgument only and allow users of .extract() to break.
This does implement some very targeted breakage for specific considerations, but I wonder if this is necessary as the correct way to users off the GIL Ref API in a case where otherwise it's near impossible for them to realise this uses it.
The reason it is not caught by CI is because CI does not have any abi...
feature enabled.
I think maybe wait for @abhiaagarwal to resolve their issue first to prove downstream consumers shouldn't be broken by this change? Just in case something was done wrong with this migration 🤔 |
worst case, delta-rs can keep |
I'll try and update pylance to pyo3 0.21 using this branch tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can wait until a future PR but I think this will need to get fixed upstream at some point. I'm not sure there is any workaround we do other than use gil-refs
. Changing the pyo3 dependency (in the integration test) like this:
pyo3 = { version = "0.21", features = ["extension-module", "abi3-py38"] }
will allow the issue to be reproduced.
arrow/src/pyarrow.rs
Outdated
if !value.is_instance(&class)? { | ||
let expected_module = class.getattr("__module__")?; | ||
let expected_module = expected_module.extract::<&str>()?; | ||
let expected_name = class.getattr("__name__")?; | ||
let expected_name = expected_name.extract::<&str>()?; | ||
let found_class = value.get_type(); | ||
let found_module = found_class.getattr("__module__")?.extract::<&str>()?; | ||
let found_name = found_class.getattr("__name__")?.extract::<&str>()?; | ||
let found_module = found_class.getattr("__module__")?; | ||
let found_module = found_module.extract::<&str>()?; | ||
let found_name = found_class.getattr("__name__")?; | ||
let found_name = found_name.extract::<&str>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also breaks for me when gil-refs
is not enabled. From this issue it appears that &str
lost FromPyObject
:
Finally, without the gil-refs feature we remove the existing implementation of FromPyObject for &str and instead implement this new trait. This works except for abi3 on old Python versions, where we have to settle for PyFunctionArgument only and allow users of .extract() to break.
This does implement some very targeted breakage for specific considerations, but I wonder if this is necessary as the correct way to users off the GIL Ref API in a case where otherwise it's near impossible for them to realise this uses it.
The reason it is not caught by CI is because CI does not have any abi...
feature enabled.
Thanks for checking this 👍 I'll add the Edit: actually, I changed it to extract a This seems to work the same, and I test with enabling (Need |
Thanks @Jefffrey! I'll try a build by the end of the day with the new commit here and see if we can build. |
Feel free to raise a new issue if you still encounter problems with this @abhiaagarwal @westonpace 👍 |
will do, apologies, I didn't get a chance to run a new build :) I'll hopefully get to it soon. Though I'll wait on DataFusion to update its dependency as well. |
Just a note is you might be waiting a while as DataFusion will need to wait for arrow-rs to release a new version before it can bump it's own, I think? Feel free to chip in on discussion of release cadence for arrow-rs here #5368 |
Can someone confirm that this isn't a breaking API change to the arrow APIs? |
I think it may be classified as a breaking change in that it'll require consumers to upgrade their pyo3 version, but in terms of API itself we preserve the old API (and just deprecate it) |
Arrow needs to be upgraded alongside pyo3 because they both link to python. Arrow's upgrade to pyo3 is in master but not yet released as of v51. apache/arrow-rs#5566 NOTE: Arrow marked `ffi::from_ffi` unsafe. apache/arrow-rs#5080
Arrow needs to be upgraded alongside pyo3 because they both link to python. Arrow's upgrade to pyo3 is in master but not yet released as of v51. apache/arrow-rs#5566 NOTE: Arrow marked `ffi::from_ffi` unsafe. apache/arrow-rs#5080
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
Per guidance from pyo3 migration guide from 0.20 to 0.21:
https://pyo3.rs/v0.21.0/migration.html
Change usages of
&PyAny
toBound<PyAny>
and adjust the public traitFromPyArrow
accordingly with a new function that accepts a bound.Are there any user-facing changes?
Yes