-
Notifications
You must be signed in to change notification settings - Fork 128
Freeze PyO3 wrappers & introduce interior mutability to avoid PyO3 borrow errors #1253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ock and Mutex for interior mutability
…, dataframe, and conditional expression modules
Cherry pick from #1252
Updated PR description |
#[pyclass]
)- Added CaseBuilderHandle guard that keeps the underlying CaseBuilder alive while holding the mutex and restores it on drop - Updated when, otherwise, and end methods to operate through the guard and consume the builder explicitly - This prevents transient None states during concurrent access and improves thread safety
…ency - Released Config read guard before converting values to Python objects in get and get_all - Ensures locks are held only while collecting scalar entries, not during expensive Python object conversion - Added regression test that runs Config.get_all and Config.set concurrently to guard against read/write contention regressions - Improves overall performance by reducing lock contention in multi-threaded scenarios
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incredible work here! The only part I think is in any way controversial is the case builder, which I think is well reasoned. I have just a few suggestions and CI has a couple of issues. Otherwise it looks great.
python/tests/test_expr.py
Outdated
|
||
# Avoid passing boolean literals positionally (FBT003). Use a named constant | ||
# so linters don't see a bare True/False literal in a function call. | ||
_TRUE = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this reduces code readability. Instead we can whitelist functions like lit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitelist functions like lit
I'll implement this.
python/tests/test_pyclass_frozen.py
Outdated
assert not unfrozen, ( | ||
"Found pyclasses missing `frozen`; add them to the allowlist only with a " | ||
"justification comment and follow-up plan:\n" + | ||
"\n".join( | ||
f"- {pyclass.module}.{pyclass.name} (defined in {pyclass.source})" | ||
for pyclass in unfrozen | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love these tests. Well done!
pub case_builder: CaseBuilder, | ||
struct CaseBuilderHandle<'a> { | ||
guard: MutexGuard<'a, Option<CaseBuilder>>, | ||
builder: Option<CaseBuilder>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the builder
in here need to be Option<>
? From reviewing the code below it seems like it must be Some
unless I missed something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Option wrapper is needed because into_inner takes ownership of the builder; clearing the field to None ensures the Drop implementation doesn’t try to reinsert or drop the same CaseBuilder a second time when the handle is consumed.
…ed constants for improved linting compliance
Which issue does this PR close?
pyclass
es frozen if possible. #1250Rationale for this change
Several Python-facing PyO3 wrappers in this crate exposed APIs that required mutable borrows (e.g.
&mut self
) even though the underlying Rust objects are shared viaArc
and intended for concurrent / re-entrant use from Python. When Python code holds onto those objects or uses them from multiple threads, PyO3 must hand out aPyRefMut
, which triggers runtime "Already borrowed" panics when re-entrancy or cross-thread access occurs.To make these thin wrappers behave like
SessionContext
(which is already#[pyclass(frozen)]
and safe to share concurrently), this PR marks many#[pyclass]
types asfrozen
and replaces exposed mutable borrows with methods that use interior mutability (Arc<parking_lot::RwLock<_>>
/Arc<parking_lot::Mutex<_>>
) for the small mutable state those types actually need.This addresses the root cause of runtime borrow errors and allows Python code to keep references to wrappers and call methods concurrently across threads.
What changes are included in this PR?
High-level summary
parking_lot
(lightweight locking primitives) toCargo.toml
andCargo.lock
and useparking_lot::{RwLock, Mutex}
to implement interior mutability for Python-exposed wrappers.#[pyclass(...)]
definitions are updated to#[pyclass(..., frozen)]
so PyO3 will not requirePyRefMut
for access.&mut self
method signatures with&self
where the method no longer needs exclusive borrow, and use internal locks to mutate internal state.#[pyo3(get, set)]
fields that were replaced with interior-mutable fields (examples:SqlSchema
now exposes getters/setters that lock and clone/replace contents).CaseBuilder
wrapper to useArc<Mutex<Option<CaseBuilder>>>
and adopt a take/restore pattern so multiple calls (including calls that return errors) preserve the builder's internal state while making the wrapper safe for concurrent use.PyConfig
to holdArc<RwLock<ConfigOptions>>
and modifyget
,set
,get_all
, and__repr__
to use read/write locks appropriately.PyDataFrame
caching to useArc<Mutex<Option<(Vec<RecordBatch>, bool)>>>
and update repr/html methods to use interior locks. Methods that produced&mut self
are now&self
.PyRecordBatchStream::next
/__next__
take&self
instead of&mut self
.frozen
to ensure the wrappers can be shared safely from Python.Notable changed files (non-exhaustive)
Cargo.toml
,Cargo.lock
— addparking_lot = "0.12"
.python/tests/test_concurrency.py
— new concurrency tests exercisingSqlSchema
,Config
, andDataFrame
from multiple threads.python/tests/test_expr.py
— updates and additions to testCaseBuilder
behavior and avoid boolean literal linter issue.src/common/schema.rs
—SqlSchema
now usesArc<RwLock<...>>
forname
,tables
,views
, andfunctions
and provides getters/setters.src/config.rs
—PyConfig
holdsArc<RwLock<ConfigOptions>>
, methods updated.src/dataframe.rs
— internal caching refactored toArc<Mutex<...>>
, repr methods made&self
.src/expr/conditional_expr.rs
—PyCaseBuilder
changed toArc<Mutex<Option<CaseBuilder>>>
and methods adapted to take/store the builder safely;case
/when
constructors updated.src/functions.rs
—case
andwhen
functions return the newPyCaseBuilder
wrapper viainto()
.src/*.rs
files —#[pyclass(..., frozen)]
added to many thin wrappers and enums so they no longer require&mut self
usage in Python.Behavioral notes
CaseBuilder
API:when
,otherwise
, andend
now returnPyDataFusionResult
and preserve the builder state on both success and failure (the builder is stored back into the wrapper after an attempted operation). This preserves the semantics that Python code can reuse the builder and that errors don't irreversibly consume the builder.Are these changes tested?
Yes: a new Python test (
python/tests/test_concurrency.py
) exercises several wrappers concurrently across threads to reproduce the race/borrow conditions and assert expected behavior. Additional tests for case builder correctness and state preservation were added/updated inpython/tests/test_expr.py
.Please run the full test suite (Rust + Python tests) in CI. I recommend running
maturin
/pytest
for the python bindings andcargo test
for Rust tests locally or in CI.Are there any user-facing changes?
API surface: The public Python API remains compatible at the call-site level for typical consumers — method names and signatures are preserved for end users. Some methods that previously required a mutable borrow at the PyO3 layer now accept
&self
from Python code; this is a transparent improvement for callers.Potential incompatibilities: Marking classes
#[pyclass(frozen)]
changes how PyO3 exposes attribute mutation at the Python attribute level. Any user code that relied on obtaining a mutable reference (PyRefMut
) and mutating the wrapper directly (rather than using the provided setters/methods) may no longer work. The intended mutation points are now exposed via explicit setters/methods (for exampleSqlSchema.set_name
,SqlSchema.set_tables
, etc.). Please review the PR for any specific wrappers your code depends on and adjust to call the explicit setters or APIs provided.Risk & compatibility
The changes are focused and low-level: they replace external mutation with interior mutability and mark wrappers as
frozen
.Risk areas that need careful review:
CaseBuilder
remain intuitive; errors should not permanently consume builder state.Arc<...>
wrappers.Notes for reviewers
CaseBuilder
take/restore logic correctly preserves state on both success and failure paths.#[pyclass(frozen)]
change does not break a previously intended API (particularly for types previously annotated with#[pyo3(get, set)]
). If a previously writable attribute was necessary, confirm the PR provides an explicit setter or alternate API.get
/get_all
/set
onPyConfig
behave as before and that conversion to/from Python objects remains correct.parking_lot
(no poisoning semantics, faster locking). Confirm this dependency is acceptable for the project and CI builds. For context, Datafusion already usesparking_lot
.std::sync::{Arc, Mutex}
) we can adjust, butparking_lot
offers simpler ergonomics and avoids poisoning semantics.