Contributor

@paultiq paultiq commented Sep 19, 2025

Per discussion, this PR enables free-threading for DuckDB on Python 3.14t (or later).

This PR involves three steps:

  • Implementing a DuckDBPyModuleState container for globals. This is a first (but not sufficient) step towards "proper" multi-phase init and subinterpreter support. In the context of this PR, its purpose is to control access to global/module state.
  • Tagging the module as py::mod_gil_not_used()
  • Adding scoped_critical_sections to guard against concurrent modifications.

Comments:

  • The scope of the changes is modest so far, thanks to DuckDB's inherent thread safety. Testing didn't yield any unexpected problems as long as each thread uses its own independent cursor.
  • This should be considered highly experimental, since it was tested on release candidate builds of 3.14 and without pyarrow, pandas, or polars support.

* If the scope of the change for module state is too much, this can be reworked without it.
** Tagging @ngoldbaum in: he's involved in a lot of other free-threading work and answered a few of my questions!

Python release timeline

Module State

PEP 489 (multi-phase initialization) calls for module state to be initialized for each interpreter*. This PR takes a half-step towards that by consolidating global state into a single static DuckDBPyModuleState. The idea is that it's easier to manage concurrency through a single instance than across several independent globals.

Alternative ideas: DuckDBPyModuleState is not absolutely required. The alternative is to keep the import cache/instance cache/etc and add mutexes or scoped_critical_sections to control access.

* "Each interpreter" refers to support for subinterpreters. Multiple subinterpreters are not supported in this PR for a few reasons, primarily the difficulty of attaching the import_cache to all the calls that assume a single global import_cache.
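To make the consolidation idea concrete, here is a minimal sketch of what "one container, one lock" could look like. It is not the PR's DuckDBPyModuleState: every name below (ModuleStateSketch, ImportCacheSketch, InstanceCacheSketch, GetModuleStateSketch) is hypothetical, the caches hold dummy values, and a plain std::mutex stands in for whatever synchronization the real code uses.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the real caches; names and contents are illustrative only.
struct ImportCacheSketch {
    std::unordered_map<std::string, int> resolved; // module name -> dummy handle
};

struct InstanceCacheSketch {
    std::unordered_map<std::string, std::shared_ptr<int>> databases; // path -> dummy database
};

// One container owns everything that used to be a separate global,
// so a single lock is enough to serialize access to all of it.
class ModuleStateSketch {
public:
    // Look up a database by path, creating it on first use.
    std::shared_ptr<int> GetOrCreateDatabase(const std::string &path) {
        std::lock_guard<std::mutex> guard(lock_);
        auto &slot = instance_cache_.databases[path];
        if (!slot) {
            slot = std::make_shared<int>(0);
        }
        return slot;
    }

    // Return the cached handle for a module name, assigning one on first use.
    int ResolveImport(const std::string &name) {
        assert(!name.empty());
        std::lock_guard<std::mutex> guard(lock_);
        auto it = import_cache_.resolved.find(name);
        if (it == import_cache_.resolved.end()) {
            int handle = static_cast<int>(import_cache_.resolved.size()) + 1;
            it = import_cache_.resolved.emplace(name, handle).first;
        }
        return it->second;
    }

private:
    std::mutex lock_;
    ImportCacheSketch import_cache_;
    InstanceCacheSketch instance_cache_;
};

// Single static instance, mirroring the "single static module state" idea.
// Function-local statics are initialized in a thread-safe way since C++11.
inline ModuleStateSketch &GetModuleStateSketch() {
    static ModuleStateSketch state;
    return state;
}
```

Callers would go through GetModuleStateSketch() instead of touching individual globals. The trade-off is the one described above: a single lock is simple to reason about, but it does nothing for subinterpreters, which would need one state object per interpreter.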

py::mod_gil_not_used tag

At module creation, in duckdb_python.cpp, py::mod_gil_not_used is passed for versions >= 3.14 for Py_GIL_DISABLED builds.

Added py::multiple_interpreters::not_supported to make the expected behavior explicit.

#if defined(Py_GIL_DISABLED) && PY_VERSION_HEX >= 0x030e0000
PYBIND11_MODULE(DUCKDB_PYTHON_LIB_NAME, m, py::mod_gil_not_used(),
                py::multiple_interpreters::not_supported()) { // NOLINT
#else
PYBIND11_MODULE(DUCKDB_PYTHON_LIB_NAME, m,
                py::multiple_interpreters::not_supported()) { // NOLINT
#endif

scoped_critical_section

DefaultConnection is synchronized with a py::scoped_critical_section.

Cursors themselves are not thread safe and are not guarded here. Segfaults occur in both release builds and these builds when cursors are used in an unsafe manner.
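The division of labor described here (the shared default connection is guarded, while each thread works on its own cursor) can be sketched in plain C++. This is an illustration under assumptions, not the PR's code: ConnectionSketch, DefaultConnectionSketch, and RunWorkers are hypothetical names, and std::mutex is used as a stand-in for py::scoped_critical_section so the example compiles without Python or pybind11.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical connection type with unsynchronized per-connection state,
// loosely modelling "cursors are not thread safe".
struct ConnectionSketch {
    int statements_run = 0;

    // Returns a fresh, independent connection object; reads no mutable state.
    std::shared_ptr<ConnectionSketch> Cursor() const {
        return std::make_shared<ConnectionSketch>();
    }

    // Not safe to call on the same object from two threads at once.
    void Execute() { ++statements_run; }
};

// Holder for the shared default connection. Only Get/Set are guarded;
// the mutex is a stand-in for a critical section on the owning object.
class DefaultConnectionSketch {
public:
    std::shared_ptr<ConnectionSketch> Get() {
        std::lock_guard<std::mutex> guard(lock_);
        if (!connection_) {
            connection_ = std::make_shared<ConnectionSketch>();
        }
        return connection_;
    }

    void Set(std::shared_ptr<ConnectionSketch> connection) {
        std::lock_guard<std::mutex> guard(lock_);
        connection_ = std::move(connection);
    }

private:
    std::mutex lock_;
    std::shared_ptr<ConnectionSketch> connection_;
};

// Each worker fetches the default connection through the guarded holder,
// then does all of its work on a private cursor. Returns the total number
// of statements executed across all workers.
inline int RunWorkers(DefaultConnectionSketch &holder, int thread_count, int statements_per_thread) {
    assert(thread_count >= 0 && statements_per_thread >= 0);
    std::vector<int> per_thread(static_cast<std::size_t>(thread_count), 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&holder, &per_thread, i, statements_per_thread] {
            auto cursor = holder.Get()->Cursor(); // private to this thread
            for (int s = 0; s < statements_per_thread; ++s) {
                cursor->Execute();
            }
            per_thread[static_cast<std::size_t>(i)] = cursor->statements_run;
        });
    }
    for (auto &worker : workers) {
        worker.join();
    }
    int total = 0;
    for (int count : per_thread) {
        total += count;
    }
    return total;
}
```

If the workers instead called Execute() directly on the object returned by holder.Get(), they would race on statements_run. That is the analogue of the unsafe shared-cursor use that this PR leaves unguarded.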

Testing

A set of threading test cases is added to tests/fast/threading. These tests should pass in both GIL-enabled and free-threaded (no-GIL) builds.

To Do / Discuss

  • Is there a better solution to eliminating the global g_module_state? Import cache seems to be the main issue... it's also an opportunity for optimization.
  • Concurrent cursor access - It's relatively easy to segfault with unsafe threaded cursor use. Should this be guarded against (GIL or no-GIL)?
  • Figure out Windows 3.14t build - I've been able to build locally and in some CI workflows, but didn't figure out why the main CI workflow fails tests.
  • Test with Pandas, PyArrow, Polars/etc when they ship their 3.14t builds
  • TSAN tests: cpython_sanity helpfully provides prebuilt CPython w/ TSAN.

@evertlammerts
Collaborator

Hey @paultiq, just a first hat-tip to ack the PR, thanks for digging in so extensively!

At the risk of getting ahead of myself (I haven't had the time to look into your changes at all yet): it's likely we'll create a separate branch for this and other freethreading experiments while we're evaluating a couple of approaches. Once it's in main it's going to be harder to go a different direction.

I'm hoping I can make time to look at this and other freethreading work later this or next week. More to come soon.

@paultiq
Contributor Author

paultiq commented Sep 22, 2025

Sounds great. I'll pause and wait for review/guidance.

The next task, imo, is to test with Pandas: they just added 3.14t builds to their 3.0 nightlies.

Unrelated to free threading, there are breaking changes in 3.0 around chained expressions and the 'str' types.

Edit: There's also the matter of figuring out what's going on with Windows. The test failures appear legitimate rather than build issues: there's some weirdness where client_contexts aren't released during exceptions. I'll open an issue here, but will wait for a separate branch so I can include a reproducer.

Edit 2: Figured out Windows free threading: it needed the /EHsc compiler flag. Unsure whether this is just a build config issue arising from adding cmake.define.CMAKE_C_FLAGS for \DPy_GIL_DISABLED. I have not committed this change.

@paultiq
Contributor Author

paultiq commented Sep 26, 2025

Update: pandas nightlies now have 3.14t builds, so once this is off hold I will add the nightly build to this branch. Pandas 3.0 breaks some tests due to the str type and CoW changes, unrelated to this PR. I didn't see any threading-related issues with the current tests, but I didn't add any new tests either.

Waiting on PyArrow next for 3.14 builds.
