* feat(subinterpreter): add opt-in TLS-cached thread state mode
subinterpreter_scoped_activate previously created and destroyed a fresh
PyThreadState on every activation when the calling OS thread was not
already running the target interpreter. Workloads that repeatedly
re-enter the same sub-interpreter from the same thread therefore churn
thread states and lose per-thread interpreter state between activations
(see pybind/pybind11#6040).
Add an opt-in subinterpreter_thread_state::cached policy: on first use a
PyThreadState is created and stored in OS-thread-local storage keyed by
the target interpreter; subsequent activations on that thread only swap
it in/out and never destroy it. The default stays transient, so existing
behavior is unchanged.
Since pybind11 does not control thread lifetime, cleanup is explicit:
subinterpreter::release_cached_thread_state() releases the calling
thread's cached state for one interpreter, and the static
release_all_cached_thread_states() releases all of the calling thread's
cached states as an end-of-thread hook. The TLS map's destructor only
frees its own nodes and never touches the Python C API, so an
unreleased state leaks rather than crashing at thread exit.
Includes test coverage and embedding docs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* style: pre-commit fixes
* refactor(subinterpreter): replace cached enum/TLS with subinterpreter_thread_state RAII
Address review feedback on the original "cached" mode by switching to an
explicit two-RAII design suggested by @b-pass:
"Create a class ... to RAII-manage the PyThreadState but start its
lifetime in an already released state. You could create another
class (or modify scoped_activate) to scoped/RAII activate the
inactive threadstate."
Removed
- enum subinterpreter_thread_state { transient, cached } and the
defaulted ctor parameter on subinterpreter_scoped_activate.
- detail::subinterpreter_thread_state_cache thread_local map.
- subinterpreter::release_cached_thread_state() and
subinterpreter::release_all_cached_thread_states().
This eliminates: the hidden per-thread map, the "release_all" footgun
across pybind11 modules (the cache was module-local), and the implicit
"must not be active when called" contract on the release functions.
Added
- Public class subinterpreter_thread_state that owns one PyThreadState
for a given subinterpreter on its constructing OS thread, created in
a released state (not current, no GIL). Non-copyable, non-movable
(PyThreadState is bound to its creating OS thread).
- subinterpreter_scoped_activate(subinterpreter_thread_state &)
overload: swaps the owned PyThreadState in on entry, swaps it out
on exit, does not touch its lifetime.
Behavior
- The existing subinterpreter_scoped_activate(subinterpreter const &)
overload is unchanged (still transient: New on entry, Delete on
exit). All previously-working code keeps working.
- With subinterpreter_thread_state, one OS thread can alternate
between multiple subinterpreters and each PyThreadState is preserved
across activations -- the use case that gil_scoped_release/acquire
+ a long-lived scoped_activate cannot solve alone (the per-thread
internals.tstate slot holds only one inactive tstate).
- The dtor of subinterpreter_thread_state guards against the
"destroyed-while-active" contract violation: if Swap reveals the
cached tstate was current, do not Swap back to a now-deleted
pointer (the safe-when-active fix b-pass requested for the old
release_* functions, applied at the natural location instead).
Lifetime contract is enforced by ordinary C++ scope: typical placement
is `thread_local`. No new release/cleanup APIs are required.
Tests cover (a) tstate identity preserved across activations on a
thread, (b) transient and reusing modes do not share state, (c)
different OS threads get distinct PyThreadStates, and (d) the
multi-subinterpreter alternation case.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(subinterpreter): address review on #6073 (same-thread checks, test scoping)
Per @b-pass's review:
- ~subinterpreter_thread_state(): add a PYBIND11_DETAILED_ERROR_MESSAGES-
guarded check that destruction happens on the OS thread that created the
PyThreadState (same PyThread_get_thread_native_id pattern as ~subinterpreter),
failing with pybind11_fail otherwise.
- subinterpreter_scoped_activate(subinterpreter_thread_state &): add the
matching DETAILED_ERROR_MESSAGES check that activation happens on the
creating OS thread, enforcing the newly documented rule.
- docs: document that activating a subinterpreter_thread_state on another OS
thread is illegal.
- tests: keep each subinterpreter (and its subinterpreter_thread_state) in an
enclosing scope so destruction order is thread-state -> subinterpreter ->
unsafe_reset_internals_for_single_interpreter(). The previous top-level
declarations ran the reset while the subinterpreters were still alive, which
is the likely cause of the CI crashes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: fix codespell (re-used -> reused) in embedding.rst
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* build: support Eigen 5
fix #6034
* build: probe Eigen 3 and 5 separately in CMake config mode
Avoid relying on package-specific handling of a bounded version range when discovering Eigen through Eigen3Config.cmake.
Made-with: Cursor
* [skip ci] build: clarify Eigen 5 module fallback comment
Explain that the MODULE-mode fallback only exists for older Eigen 3 setups so the remaining fallback path does not look like an unresolved Eigen 5 issue.
Made-with: Cursor
* [skip ci] docs: add Eigen 5 entry to v3.0.4 changelog
Document the Eigen 5 CMake package detection fix in the 3.0.4 release notes before merging the PR.
Made-with: Cursor
---------
Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Document the post-v3.0.3 fixes and CI changes ahead of the patch release so the release prep can be reviewed before the version bump work.
Made-with: Cursor
* init
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* Add constexpr to is_floating_point check
This is known at compile time so it can be constexpr
* Allow noconvert float to accept int
* Update noconvert documentation
* Allow noconvert complex to accept int and float
* Add complex strict test
* style: pre-commit fixes
* Update unit tests so int becomes double.
* style: pre-commit fixes
* remove if (constexpr)
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* fix spelling error
* bump order in #else
* Switch order in c++11 only section
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* ci: trigger build
* ci: trigger build
* Allow casting from float to int
The int type caster allows anything that implements __int__, with the explicit exception of the Python float. I can't see any reason for this.
This modifies the int casting behaviour to accept a float.
If the argument is marked as noconvert() it will only accept int.
* tests for py::float into int
* Update complex_cast tests
* Add SupportsIndex to int and float
* style: pre-commit fixes
* fix assert
* Update docs to mention other conversions
* fix pypy __index__ problems
* style: pre-commit fixes
* extract out PyLong_AsLong __index__ deprecation
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* style: pre-commit fixes
* Add back env.deprecated_call
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* remove note
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* remove untrue comment
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* fix noconvert_args
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* resolve error
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* Add comment
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* [skip ci]
tests: Add overload resolution test for float/int breaking change
Add test_overload_resolution_float_int() to explicitly test the breaking
change where int arguments now match float overloads when registered first.
The existing tests verify conversion behavior (int -> float, int/float -> complex)
but do not test overload resolution when both float and int overloads exist.
This test fills that gap by:
- Testing that float overload registered before int overload matches int(42)
- Testing strict mode (noconvert) overload resolution breaking change
- Testing complex overload resolution with int/float/complex overloads
- Documenting the breaking change explicitly
This complements existing tests which verify 'can it convert?' by testing
'which overload wins when multiple can convert?'
* Add test to verify that custom __index__ objects (not PyLong) work correctly with complex conversion. These should be consistent across CPython, PyPy, and GraalPy.
* Improve comment clarity for PyPy __index__ handling
Replace cryptic 'So: PYBIND11_INDEX_CHECK(src.ptr())' comment with
clearer explanation of the logic:
- Explains that we need to call PyNumber_Index explicitly on PyPy
for non-PyLong objects
- Clarifies the relationship to the outer condition: when convert
is false, we only reach this point if PYBIND11_INDEX_CHECK passed
above
This makes the code more maintainable and easier to understand
during review.
* Undo inconsequential change to regex in test_enum.py
During merge, HEAD's regex pattern was kept, but master's version is preferred.
The order of ` ` and `\|` in the character class is arbitrary. Keep master's order
(already fixed in PR #5891; sorry I missed looking back here when working on 5891).
* test_methods_and_attributes.py: Restore existing `m.overload_order(1.1)` call and clearly explain the behavior change.
* Reject float → int conversion even in convert mode
Enabling implicit float → int conversion in convert mode causes
silent truncation (e.g., 1.9 → 1). This is dangerous because:
1. It's implicit - users don't expect truncation when calling functions
2. It's silent - no warning or error
3. It can hide bugs - precision loss is hard to detect
This change restores the explicit rejection of PyFloat_Check for integer
casters, even in convert mode. This is more in line with Python's behavior
where int(1.9) must be explicit.
Note that the int → float conversion in noconvert mode is preserved,
as that's a safe widening conversion.
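The truncation hazard described above can be illustrated in plain Python. This is only a sketch of the language-level behavior the commit appeals to, not pybind11 code: `int()` truncates silently, while `operator.index()` refuses a float outright.

```python
import operator

# int() truncates toward zero without any warning: the hazard described above.
truncated = int(1.9)

# operator.index() only accepts objects that implement __index__, so a float
# is rejected with TypeError instead of being truncated.
try:
    operator.index(1.9)
    index_rejected = False
except TypeError:
    index_rejected = True

# The opposite direction (int -> float) is a widening conversion and keeps
# the value for small integers.
widened = float(3)
```
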
* Revert test changes that sidestepped implicit float→int conversion
This reverts all test modifications that were made to accommodate
implicit float→int conversion in convert mode. With the production
code change that explicitly rejects float→int conversion even in
convert mode, these test workarounds are no longer needed.
Changes reverted:
- test_builtin_casters.py: Restored cant_convert(3.14159) and
np.float32 conversion with deprecated_call wrapper
- test_custom_type_casters.py: Restored TypeError expectation for
m.ints_preferred(4.0)
- test_methods_and_attributes.py: Restored TypeError expectation
for m.overload_order(1.1)
- test_stl.py: Restored float literals (2.0) that were replaced with
strings to avoid conversion
- test_factory_constructors.py: Restored original constructor calls
that were modified to avoid float→int conversion
Also removes the unused avoid_PyLong_AsLong_deprecation fixture
and related TypeVar imports, as all uses were removed.
* Replace env.deprecated_call() with pytest.deprecated_call()
The env.deprecated_call() function was removed, but two test cases
still reference it. Replace with pytest.deprecated_call(), which is
the standard pytest context manager for handling deprecation warnings.
Since we already require pytest>=6 (see tests/requirements.txt), the
compatibility function is obsolete and pytest.deprecated_call() is
available.
* Update test expectations for swapped NoisyAlloc overloads
PR 5879 swapped the order of NoisyAlloc constructor overloads:
- (int i, double) is now placement new (comes first)
- (double d, double) is now factory pointer (comes second)
This swap is necessary because pybind11 tries overloads in order
until one matches. With int → float conversion now allowed:
- create_and_destroy(4, 0.5): Without the swap, (double d, double)
would match first (since int → double conversion is allowed),
bypassing the more specific (int i, double) overload. With the
swap, (int i, double) matches first (exact match), which is
correct.
- create_and_destroy(3.5, 4.5): (int i, double) fails (float → int
is rejected), then (double d, double) matches, which is correct.
The swap ensures exact int matches are preferred over double matches
when an int is provided, which is the expected overload resolution
behavior.
Update the test expectations to match the new overload resolution
order.
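The effect of the swap can be modeled with a small first-match dispatcher in plain Python. This is a hypothetical sketch: `accepts_int`, `accepts_double`, and `dispatch` are illustrative names, not pybind11 internals, and the predicates only approximate the caster rules described in these commits (int -> float allowed, float -> int rejected).

```python
def accepts_int(arg):
    # Stand-in for an int caster: floats are rejected.
    return isinstance(arg, int)


def accepts_double(arg):
    # Stand-in for a double caster: ints are accepted (int -> float).
    return isinstance(arg, (int, float))


def dispatch(overloads, *args):
    """Return the name of the first overload whose predicates accept all args."""
    for name, predicates in overloads:
        if len(predicates) == len(args) and all(
            pred(arg) for pred, arg in zip(predicates, args)
        ):
            return name
    raise TypeError("no matching overload")


# Order after the swap: (int, double) is tried before (double, double).
swapped = [
    ("int_double", (accepts_int, accepts_double)),
    ("double_double", (accepts_double, accepts_double)),
]
# Order before the swap, for comparison.
original = list(reversed(swapped))
```

With the swapped order, `dispatch(swapped, 4, 0.5)` selects `int_double` and `dispatch(swapped, 3.5, 4.5)` falls through to `double_double`; with the original order, `dispatch(original, 4, 0.5)` is captured by `double_double`.
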
* Resolve clang-tidy error:
/__w/pybind11/pybind11/include/pybind11/cast.h:253:46: error: repeated branch body in conditional chain [bugprone-branch-clone,-warnings-as-errors]
253 | } else if (PyFloat_Check(src.ptr())) {
| ^
/__w/pybind11/pybind11/include/pybind11/cast.h:258:10: note: end of the original
258 | } else if (convert || PYBIND11_LONG_CHECK(src.ptr()) || PYBIND11_INDEX_CHECK(src.ptr())) {
| ^
/__w/pybind11/pybind11/include/pybind11/cast.h:283:16: note: clone 1 starts here
283 | } else {
| ^
* Add test coverage for __index__ and __int__ edge cases: incorrectly returning float
These tests ensure that:
- Invalid return types (floats) are properly rejected
- The fallback from __index__ to __int__ works correctly in convert mode
- noconvert mode correctly prevents fallback when __index__ fails
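The edge cases listed above can be sketched in plain Python. The helper `to_int` and both classes are hypothetical and only mimic the described semantics (try `__index__` first, fall back to `__int__` only in convert mode); they are not the caster implementation.

```python
import operator


class BadIndexGoodInt:
    """__index__ wrongly returns a float; __int__ is well behaved."""

    def __index__(self):
        return 1.5

    def __int__(self):
        return 7


class BadInt:
    """__int__ wrongly returns a float and there is no __index__."""

    def __int__(self):
        return 2.5


def to_int(obj, convert):
    # Floats are rejected in both modes, mirroring the commits above.
    if isinstance(obj, float):
        raise TypeError("float is not accepted")
    try:
        return operator.index(obj)
    except TypeError:
        # noconvert: no fallback. Also skip objects without __int__ (e.g. str).
        if not convert or not hasattr(type(obj), "__int__"):
            raise
    # convert mode: fall back to __int__; a non-int result raises TypeError.
    return int(obj)


def raises_type_error(func, *args):
    try:
        func(*args)
    except TypeError:
        return True
    return False
```
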
* Minor comment-only changes: add PR number, for easy future reference
* Ensure we are not leaking a Python error if something is wrong elsewhere (e.g. UB, or bug in Python beta testing).
See also: https://github.com/pybind/pybind11/pull/5879#issuecomment-3521099331
* [skip ci] Bump PYBIND11_INTERNALS_VERSION to 12 (for PRs 5879, 5887, 5960)
---------
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
Co-authored-by: gentlegiantJGC <gentlegiantJGC@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Bump internals version
* Prevent internals destruction before all pybind11 types are destroyed
* Use Py_XINCREF and Py_XDECREF
* Hold GIL before decref
* Use weakrefs
* Remove unused code
* Move code location
* Move code location
* Move code location
* Try add tests
* Fix PYTHONPATH
* Fix PYTHONPATH
* Skip tests for subprocess
* Revert to leak internals
* Revert to leak internals
* Revert "Revert to leak internals"
This reverts commit c5ec1cf886.
This reverts commit 72c2e0aa9b.
* Revert internals version bump
* Reapply to leak internals
This reverts commit 8f25a254e8.
* Add re-entrancy detection for internals creation
Prevent re-creation of internals after destruction during interpreter
shutdown. If pybind11 code runs after internals have been destroyed,
fail early with a clear error message instead of silently creating
new empty internals that would cause type lookup failures.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Fix C++11/C++14 support
* Add lock under multiple interpreters
* Try fix tests
* Try fix tests
* Try fix tests
* Update comments and assertion messages
* Update comments and assertion messages
* Update comments
* Update lock scope
* Use original pointer type for Windows
* Change hard error to warning
* Update lock scope
* Update lock scope to resolve deadlock
* Remove scope release of GIL
* Update comments
* Lock pp on reset
* Mark content created after assignment
* Update comments
* Simplify implementation
* Update lock scope when delete unique_ptr
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* docs: seed 3.0.2 changelog from needs-changelog PRs
Collect suggested entries early to streamline release prep.
* Misc trivial manual fixes.
* Shorten changelog entry for PR 5862
* Remove mention of a minor doc formatting fix.
* Cursor-generated "all past-tense" style
* Restore the meaning of the 5958 entry using the "... now ..." trick, and restore a couple other entries that also use the "now" trick.
* Replace ... now ... style with ... updated to ... style
* [skip ci] docs: group 3.0.2 entries under Internal heading
Align changelog categories with recent releases for review.
* Update changelog with CMake policy compatibility fix
Fix compatibility with CMake policy CMP0190 for cross-compiling.
* Add changelog entries for 5965 and 5968
* docs: make CMP0190 changelog entry past tense
Align 3.0.2 bug-fix entry with changelog style.
* [skip ci] docs: add missing 3.0.2 changelog entries
Capture remaining needs-changelog PRs across categories. (These slipped through the cracks somehow.)
---------
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
* Remove skip for Move Subinterpreter test on free-threaded Python 3.14+
* Fix deadlock by detaching from the main interpreter before joining the thread.
* style: pre-commit fixes
---------
Co-authored-by: b-pass <b-pass@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Remove enum from bold in doc
* [skip ci] Remove bold formatting around (see #5528)
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Enhance: edit doc py::native_enum feature in upgrade.rst
Added information about the inclusion requirement for py::native_enum feature.
* [skip ci] Polish wording
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
When comparing buffer types there are some edge cases on some platforms that are equivalent but the format string is not identical.
item_type_is_equivalent_to is more forgiving than direct string comparison.
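As a rough illustration of why string comparison is too strict, the sketch below treats two single-character native format codes as equivalent when they have the same kind and the same size. The helper `formats_equivalent` is hypothetical and is not how `item_type_is_equivalent_to` is implemented; it only shows the idea using Python's `struct` module.

```python
import struct

_SIGNED = set("bhilqn")
_UNSIGNED = set("BHILQN")
_FLOAT = set("efd")


def _kind(code):
    # Classify a single-character native struct format code.
    if code in _SIGNED:
        return "signed"
    if code in _UNSIGNED:
        return "unsigned"
    if code in _FLOAT:
        return "float"
    raise ValueError(f"unsupported format code: {code!r}")


def formats_equivalent(a, b):
    """True if both codes describe the same kind and size on this platform."""
    return _kind(a) == _kind(b) and struct.calcsize(a) == struct.calcsize(b)
```

On a platform where `long` and `long long` are both 8 bytes, `formats_equivalent("l", "q")` is true even though `"l" != "q"`; elsewhere it is false.
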
Created using [mini-swe-agent](https://mini-swe-agent.com) and the prompt:
I'd like to find usages of PYBIND11_MODULE in the docs folder and add py::mod_gil_not_used() as a third argument if there are only two arguments.
These are examples, and it's really a good idea to always include that now.
I removed a few of the changes.
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* [skip ci] Small docs/release.rst update, mainly to warn about `git push --tags`.
* Remove mention of `git push --tags`
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
---------
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
* Update docs/changelog.md and change version to v3.0.0 (final)
* [skip ci] Add `|SPEC 4 — Using and Creating Nightly Wheels|` badge in main README.rst
* feat: scoped_critical_section
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* refactor: pull out to file
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* style: pre-commit fixes
* fix: GIL code in some compilers
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* fix: move to correct spot
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
---------
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* refactor: use CPython macros to construct `PYBIND11_VERSION_HEX`
* docs: update release guide
* tests: add test to keep version values in sync
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* style: pre-commit fixes
* test: update version test
* test: update version test
* test: update version test
* chore: update code comments
* Update docs/release.rst
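For reference, CPython packs its version into `PY_VERSION_HEX` as major, minor, micro, release level, and serial. The sketch below reproduces that layout in Python, assuming `PYBIND11_VERSION_HEX` follows the same convention; `version_hex` is a hypothetical helper, not part of pybind11.

```python
import sys

# Release-level nibbles used by CPython (PY_RELEASE_LEVEL_*).
_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def version_hex(major, minor, micro, level="final", serial=0):
    """Pack a version into the CPython hexversion layout."""
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (_LEVELS[level] << 4)
        | serial
    )
```

Under that assumption, 3.0.0 final packs to `0x030000F0` and 3.0.0rc1 to `0x030000C1`; applying the helper to `sys.version_info` reproduces `sys.hexversion`.
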
---------
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* First draft of a subinterpreter embedding API
* Move subinterpreter tests to their own file
* Migrate subinterpreter tests to use the new embedded class.
* Add a test for moving subinterpreters across threads for destruction
And find a better way to make that work.
* Code organization
* Add a test which demonstrates how gil_scoped interacts with sub-interpreters
* Add documentation for embedded sub-interpreters
* Some additional docs work
* Add some convenience accessors
* Add some docs cross references
* Sync some things that were split out into #5665
* Update subinterpreter docs example to not use the CPython api
* Fix pip test
* style: pre-commit fixes
* Fix MSVC warnings
I am surprised other compilers allowed this code with a deleted move ctor.
* Add some sub-headings to the docs
* Oops, make_unique is C++14 so remove it from the tests.
* I think this fixes the EndInterpreter issues on all versions.
It just has to be ifdef'd because it is slightly broken on 3.12, working well on 3.13, and kind of crashy on 3.14beta. These two version ifdefs solve all the issues.
* Add a note about exceptions.
They contain Python object references and acquire the GIL, that means they are a danger with subinterpreters!
* style: pre-commit fixes
* Add try/catch to docs examples to match the tips
* Python 3.12 is very picky about this first PyThreadState
Try special casing the destruction on the same thread.
* style: pre-commit fixes
* Missed a rename in an ifdef block
* I think this test is causing problems in 3.12, so try ifdefing it to see if the problems go away.
* style: pre-commit fixes
* Document the 3.12 constraints with a warning
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* ci: add cpptest to the clang-tidy job
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* noexcept move operations
* Update include/pybind11/subinterpreter.h
std::memset
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
---------
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
* Squashed prepv300/manuscript — 30b9c268aeb98308ea42aaccfd5fe454e173c6fc — 2025-03-30 14:56:03 -0700 [skip ci]
[Browse prepv300/manuscript tree](30b9c268ae)
[Browse prepv300/manuscript commits](30b9c268ae/)
* docs: update changelog
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* docs: upgrade guide CMake suggestions
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* Explain type_caster_enum_type_enabled, copyable_holder_caster_shared_ptr_with_smart_holder_support_enabled, move_only_holder_caster_unique_ptr_with_smart_holder_support_enabled in Upgrade guide.
* Add a small section for py::bind_vector, py::bind_map & py::smart_holder
* Fix tiny oversight: Reference back to the current release v2.13 (not v2.12)
* Remove sentence: Using self._pybind11_conduit_v1_() ... should keep extension compatibility.
This isn't true, because we also modernized `PYBIND11_PLATFORM_ABI_ID`
(which I believe was absolutely necessary). I think it'll be too complicated
to explain that here, and there is a mention in the Upgrade guide.
* Changelog: combine #4953 and #5439
* Trivial whitespace/formatting fixes/enhancements.
* chore: add more to deprecation page
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* docs: update for recent additions
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* docs: fixes and set rc1 version
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* fix: support rc versions
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* Undo erroneous copilot change: We need to use `detail::enable_if_t`, for compatibility with C++11 and C++14.
* Empty lines cleanup.
* Rewording of "CMake support now defaults to ..." paragraph.
* Add missing backticks in upgrade guide.
* Try :ref:deprecated instead of :doc:deprecated
* docs: last bit of polish
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* Piggy-back trivial whitespace cleanup that was missed in PR #5669
---------
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: Henry Schreiner <henryschreineriii@gmail.com>
* Move embedded modules to multiphase init
So that they too can support multi-interpreter and nogil tags
* Update the multiple interpreter test for embedded module changes
* Add a note to embedded module docs about the new tags
* Oops, missed a warning pop
* Remove unused variable
* Update ci.yml
* Fix this embedded GIL test for free-threading
* Oops, need to use ptr() here
* This test created a subinterpreter when PYBIND11_SUBINTERPRETER_SUPPORT was off
So the fix is really this test should not be run in these older versions at all.
The hang was a GIL issue between the subinterpreters during pybind11::exception::what().
* fix: standard mutex for 3.13t
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
---------
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>