gh-137400: Fix thread-safety issues when profiling all threads #137518

Merged: 10 commits merged on Aug 13, 2025

Conversation

colesbury
Contributor

@colesbury colesbury commented Aug 7, 2025

There were a few thread-safety issues when profiling or tracing all threads via `PyEval_SetProfileAllThreads` or `PyEval_SetTraceAllThreads`:

  • The loop over thread states could crash if a thread exits concurrently (in both the free threading and default build)
  • The modification of `c_profilefunc` and `c_tracefunc` wasn't thread-safe on the free threading build.
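
For readers less familiar with "profiling all threads", here is a minimal Python sketch of the feature being hardened. It assumes Python 3.12 or later, where `threading.setprofile_all_threads` is available; treating it as the Python-level route to `PyEval_SetProfileAllThreads` is my assumption, not something stated in this PR. The helper names (`count_profiled_calls`, `work`, `worker`) are hypothetical. The sketch deliberately keeps the worker threads alive while the profiler is installed, so it shows the API rather than reproducing the crash.

```python
import threading
import time


def count_profiled_calls(num_threads=4, timeout=10.0):
    """Profile already-running threads and count their calls to work()."""
    counts = {}              # thread ident -> number of observed "call" events
    lock = threading.Lock()  # protects counts
    stop = threading.Event()

    def work():
        pass

    def profile(frame, event, arg):
        # Count only Python-level calls to work(); ignore every other event.
        if event == "call" and frame.f_code is work.__code__:
            ident = threading.get_ident()
            with lock:
                counts[ident] = counts.get(ident, 0) + 1

    def worker():
        while not stop.is_set():
            work()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()

    # Install the profile function while the workers are already running.
    threading.setprofile_all_threads(profile)
    try:
        # Wait until every worker has been observed at least once.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with lock:
                if len(counts) == num_threads:
                    break
            time.sleep(0.01)
    finally:
        # Uninstall before letting the workers exit.
        threading.setprofile_all_threads(None)
        stop.set()
        for t in threads:
            t.join()
    return counts
```

Calling `count_profiled_calls(3)` should return a dict with one entry per worker thread. The first bullet above concerns the opposite situation, where threads exit while the profile function is being installed.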

@colesbury colesbury added the 🔨 test-with-refleak-buildbots label (Test PR w/ refleak buildbots; report in status section) Aug 7, 2025
@bedevere-bot
🤖 New build scheduled with the buildbot fleet by @colesbury for commit 09b28c8 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F137518%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots label (Test PR w/ refleak buildbots; report in status section) Aug 7, 2025
@colesbury colesbury changed the title from "gh-137400: Fix a thread-safety issues when profiling all threads" to "gh-137400: Fix thread-safety issues when profiling all threads" Aug 7, 2025
@colesbury colesbury added the "needs backport to 3.14" and "needs backport to 3.13" labels (bugs and security fixes) Aug 7, 2025
@colesbury colesbury marked this pull request as ready for review August 7, 2025 16:15
@@ -387,6 +413,38 @@ def noop():

        self.observe_threads(noop, buf)

    def test_trace_concurrent(self):
        # Test calling a function concurrently from a tracing and a non-tracing
Member

What was the behavior of this test when this was not fixed? Will it crash because the code object is partially instrumented, or will we miss some trace func calls because the code object is not instrumented at all?

Contributor Author

Mostly it showed up as data races reported by TSan. I don't think I saw any crashes for this one.

The test above (SetProfileAllThreadsMultiThreaded) would crash before this PR.

Contributor Author

For clarification, this test is related to the change to `_MAYBE_INSTRUMENT` with `int check_instrumentation = (tstate->tracing == 0);`
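
The scenario under discussion can be sketched in Python roughly as follows. This is an illustrative sketch under my own assumptions, not the PR's test: the names `run_concurrent_calls`, `func`, `noop`, `bg_thread` and `tracefunc` are hypothetical. One thread calls `func` from inside its trace function (the interpreter disables tracing while a trace function runs), while an untraced background thread calls the same function.

```python
import sys
import threading


def run_concurrent_calls():
    """Call func() from a trace function and from an untraced thread."""
    barrier = threading.Barrier(2)
    results = {"traced_events": 0, "inner_calls": 0, "bg_calls": 0}

    def func():
        total = 0
        for i in range(100):
            total += i
        return total

    def noop():
        pass

    def bg_thread():
        # This thread has no trace function installed.
        barrier.wait()
        func()
        results["bg_calls"] += 1

    def tracefunc(frame, event, arg):
        # Runs in the main thread with tracing temporarily disabled,
        # so these func() calls do not generate trace events themselves.
        results["traced_events"] += 1
        for _ in range(10):
            func()
            results["inner_calls"] += 1
        return tracefunc

    t = threading.Thread(target=bg_thread)
    t.start()
    try:
        sys.settrace(tracefunc)  # applies to the calling thread only
        barrier.wait()           # release the background thread
        noop()
    finally:
        sys.settrace(None)
    t.join()
    return results
```

On a build with the fix this should simply run to completion; as noted above, the original problem showed up mostly as data races reported by TSan, so a plain run of a sketch like this is not expected to fail visibly either way.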

Member

@ZeroIntensity ZeroIntensity left a comment

A few minor comments

@colesbury colesbury merged commit a10152f into python:main Aug 13, 2025
62 checks passed
@colesbury colesbury deleted the gh-137400-now-with-fewer-crashes branch August 13, 2025 18:15
@miss-islington-app

Thanks @colesbury for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

@miss-islington-app

Sorry, @colesbury, I could not cleanly backport this to 3.14 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker a10152f8fd0f4b291e53d646cffe22fbeec73e1e 3.14

@miss-islington-app

Sorry, @colesbury, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker a10152f8fd0f4b291e53d646cffe22fbeec73e1e 3.13

colesbury added a commit to colesbury/cpython that referenced this pull request Aug 13, 2025
…hreads (pythongh-137518)

There were a few thread-safety issues when profiling or tracing all
threads via PyEval_SetProfileAllThreads or PyEval_SetTraceAllThreads:

* The loop over thread states could crash if a thread exits concurrently
  (in both the free threading and default build)
* The modification of `c_profilefunc` and `c_tracefunc` wasn't
  thread-safe on the free threading build.
(cherry picked from commit a10152f)

Co-authored-by: Sam Gross <[email protected]>
@bedevere-app

bedevere-app bot commented Aug 13, 2025

GH-137730 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the "needs backport to 3.14" label (bugs and security fixes) Aug 13, 2025
colesbury added a commit to colesbury/cpython that referenced this pull request Aug 13, 2025
…hreads (pythongh-137518)

There were a few thread-safety issues when profiling or tracing all
threads via PyEval_SetProfileAllThreads or PyEval_SetTraceAllThreads:

* The loop over thread states could crash if a thread exits concurrently
  (in both the free threading and default build)
* The modification of `c_profilefunc` and `c_tracefunc` wasn't
  thread-safe on the free threading build.
(cherry picked from commit a10152f)

Co-authored-by: Sam Gross <[email protected]>
@bedevere-app

bedevere-app bot commented Aug 13, 2025

GH-137733 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the "needs backport to 3.13" label (bugs and security fixes) Aug 13, 2025