Skip to content

Commit

Permalink
BackgroundTask binding : Release GIL when destroying BackgroundTask
Browse files Browse the repository at this point in the history
The BackgroundTask constructor calls `cancelAndWait()`, and a Python function can't be cancelled unless it is able to aquire the GIL. Thus if we don't release the GIL before calling `wait()`, we can deadlock. We were already releasing the GIL appropriately for tasks returned by `ParallelAlgo::callOnBackgroundThread()` but not for those constructed with `BackgroundTask()` directly.

Fixing this revealed another problem that was dealt with correctly by `callOnBackgroundThread()` but not by the `BackgroundTask()` constructor. We need to hold the GIL when destroying the Python function owned by the task, and we were doing this by manually deleting it after it ran, at a point when we knew we had the GIL. But this wasn't sufficient if the BackgroundTask was cancelled before the function even started. In this case, the function would be destroyed from C++ and we would be deleting Python objects without holding the GIL. The solution is again to adopt the GIL management used by `callOnBackgroundThread()`.
  • Loading branch information
johnhaddon committed Dec 8, 2023
1 parent 54d794d commit ca37126
Show file tree
Hide file tree
Showing 3 changed files with 60 additions and 26 deletions.
5 changes: 5 additions & 0 deletions Changes.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,11 @@ Improvements
- TaskList, FrameMask : Reimplemented in C++ for improved performance.
- Cache : Increased default computation cache size to 8Gb. Call `Gaffer.ValuePlug.setCacheMemoryLimit()` from a startup file to override this.

Fixes
-----

- BackgroundTask : Fixed potential deadlock caused by destroying a BackgroundTask from Python while it was still running.

API
---

Expand Down
21 changes: 21 additions & 0 deletions python/GafferTest/BackgroundTaskTest.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@
##########################################################################

import functools
import threading
import time

import IECore
Expand Down Expand Up @@ -234,5 +235,25 @@ def f( canceller ) :

t.cancelAndWait()

def testDestroyWhileRunning( self ) :

started = threading.Event()

def f( canceller ) :

started.set()
while True :
IECore.Canceller.check( canceller )

# Start task and wait for it to be running
task = Gaffer.BackgroundTask( None, f )
started.wait()

# Destroy task. This will cancel it via `canceller`,
# and wait for it to return. If we don't release the
# GIL while doing that then it'll never be able to
# check for cancellation, and we'll deadlock.
del task

if __name__ == "__main__":
unittest.main()
60 changes: 34 additions & 26 deletions src/GafferModule/ParallelAlgoBinding.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -56,28 +56,52 @@ using namespace boost::python;
namespace
{

BackgroundTask *backgroundTaskConstructor( const Plug *subject, object f )
std::shared_ptr<BackgroundTask> withGILReleaseDeleter( std::unique_ptr<BackgroundTask> &backgroundTask )
{
auto fPtr = std::make_shared<boost::python::object>( f );
return new BackgroundTask(
return std::shared_ptr<BackgroundTask>(
backgroundTask.release(),
// Custom deleter. We need to release the GIL when deleting, because the
// destructor waits on the background task, and the background task
// might need the GIL in order to complete.
[]( BackgroundTask *t ) {
IECorePython::ScopedGILRelease gilRelease;
delete t;
}
);
}

std::shared_ptr<boost::python::object> withGILAcquireDeleter( const boost::python::object &o )
{
return std::shared_ptr<boost::python::object>(
new boost::python::object( o ),
[]( boost::python::object *o ) {
// Custom deleter. We must hold the GIL when deleting Python
// objects.
IECorePython::ScopedGILLock gilLock;
delete o;
}
);
}

std::shared_ptr<BackgroundTask> backgroundTaskConstructor( const Plug *subject, object f )
{
auto fPtr = withGILAcquireDeleter( f );
auto backgroundTask = std::make_unique<BackgroundTask>(
subject,
[fPtr]( const IECore::Canceller &canceller ) mutable {
IECorePython::ScopedGILLock gilLock;
try
{
(*fPtr)( boost::ref( canceller ) );
// We are likely to be the last owner of the python
// function object. Make sure we release it while we
// still hold the GIL.
fPtr.reset();
}
catch( boost::python::error_already_set & )
{
fPtr.reset();
IECorePython::ExceptionAlgo::translatePythonException();
}
}
);

return withGILReleaseDeleter( backgroundTask );
}

void backgroundTaskCancel( BackgroundTask &b )
Expand Down Expand Up @@ -203,13 +227,7 @@ std::shared_ptr<BackgroundTask> callOnBackgroundThread( const Plug *subject, boo
// The BackgroundTask we return will own the python function we
// pass to it. Wrap the function so that the GIL is acquired
// before the python object is destroyed.
auto fPtr = std::shared_ptr<boost::python::object>(
new boost::python::object( f ),
[]( boost::python::object *o ) {
IECorePython::ScopedGILLock gilLock;
delete o;
}
);
auto fPtr = withGILAcquireDeleter( f );

auto backgroundTask = ParallelAlgo::callOnBackgroundThread(
subject,
Expand All @@ -226,17 +244,7 @@ std::shared_ptr<BackgroundTask> callOnBackgroundThread( const Plug *subject, boo
}
);

return std::shared_ptr<BackgroundTask>(
backgroundTask.release(),
// Custom deleter. We need to release
// the GIL when deleting, because the destructor
// waits on the background task, and the background
// task might need the GIL in order to complete.
[]( BackgroundTask *t ) {
IECorePython::ScopedGILRelease gilRelease;
delete t;
}
);
return withGILReleaseDeleter( backgroundTask );
}

} // namespace
Expand Down

0 comments on commit ca37126

Please sign in to comment.