[BUG]: Race condition between wrapper deallocation and lookup #864
wjakob pushed a commit that referenced this issue on Jan 20, 2025:
There's a race condition between wrapper lookup and wrapper deallocation where a Python wrapper may be returned that's in the process of being deallocated. This commit fixes the issue (see #864 for further details).
Fixed via #865.
Problem description
There's a race condition between wrapper lookup and wrapper deallocation where a Python wrapper may be returned that's in the process of being deallocated. I have a reproducer for the free threading build. I think the problem can also affect the default (GIL-enabled) build, but I don't have a reproducer for that yet.
This is the counterpart of the pybind11 bug:
Explanation
nb_type_put looks up an existing Python wrapper object in inst_c2p. The inst_dealloc function removes the wrapper from inst_c2p when the wrapper is deallocated.
During the nb_type_put call, it's possible that the found wrapper has a reference count of 0 and is in the process of being deallocated, but has not yet been removed from inst_c2p. In the free threading build, this can happen because nb_type_put can run concurrently with inst_dealloc up to the acquisition of the shard lock. I think this can also happen in the default (GIL-enabled) build, because things like Py_CLEAR(*dict) can call arbitrary code that may temporarily release the GIL.
Suggested fix
nb_type_put should only incref and return a wrapper if the reference count is not zero. In the GIL-enabled build, this is roughly a check that the reference count is non-zero before the incref.
In the free threading build, we'll want to use PyUnstable_TryIncref when it's available, or implement that logic like we're doing in pybind11.
See also
_Py_TryIncref public as an unstable API as PyUnstable_TryIncref() python/cpython#128844
Reproducible example code
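Separately from the reproducer, the "only incref if the reference count is not zero" rule from the suggested fix can be modelled in isolation. The sketch below is plain C++ with no Python or nanobind dependency. Wrapper, try_incref and lookup are hypothetical names invented for illustration, and a single atomic counter is an assumption: reference counting in free-threaded CPython is more involved, so this is not how PyUnstable_TryIncref or nb_type_put is actually implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a Python wrapper object.
// Only the reference count is modelled here.
struct Wrapper {
    std::atomic<std::int64_t> refcount{1};
};

// Try to take a new strong reference.
// Returns false if the count has already reached zero, meaning the object
// is being deallocated and must not be handed out again.
inline bool try_incref(Wrapper &w) {
    std::int64_t cur = w.refcount.load(std::memory_order_relaxed);
    while (cur > 0) {
        // On failure, compare_exchange_weak stores the current value in
        // 'cur', and the loop condition re-checks it against zero.
        if (w.refcount.compare_exchange_weak(cur, cur + 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed))
            return true;
    }
    return false;
}

// Hypothetical lookup helper: 'cached' plays the role of an entry found in
// the instance map. The entry is returned only if a reference could be
// taken. Otherwise the caller is expected to create a fresh wrapper.
inline Wrapper *lookup(Wrapper *cached) {
    if (cached != nullptr && try_incref(*cached))
        return cached;
    return nullptr;
}
```

With this shape, a wrapper whose count has dropped to zero but which is still present in the map is treated as a miss rather than being revived. A live wrapper is returned with its count increased by one.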