Add query cache hits to detailed query table #2168
Draft
Kobzol wants to merge 6 commits into rust-lang:master from Kobzol:detailed-query-page-hit-count
bors added a commit to rust-lang/rust that referenced this pull request on Jul 2, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts

Self-profile can record various types of events, some of which are not enabled by default, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which record the thread ID and the current timestamp, and construct an individual event for each cache hit. This is incredibly expensive: in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`).

We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actual cache hit counts, not the timestamp/thread ID metadata associated with them.

This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion to how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was a ~10% increase in this case.

With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self-profiling is enabled.

We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up; it doesn't need to be done in sync with this PR, and it doesn't require changes in rustc.

CC `@cjgillot` r? `@oli-obk`
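The aggregation idea described in the commit message can be sketched roughly as follows. This is a minimal illustration and not the actual rustc implementation: `QueryInvocationId`, `CacheHitCounters`, `record_hit`, and `dump` are hypothetical names, and the `RwLock<HashMap<..>>` storage is an assumption chosen for brevity. The point is only that a cache hit becomes a single atomic increment, and the counts are emitted once at the end rather than as one timestamped event per hit.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

/// Hypothetical stand-in for a query invocation (a query plus its arguments).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct QueryInvocationId(pub u32);

/// Aggregates cache hits per invocation instead of emitting one event per hit.
#[derive(Default)]
pub struct CacheHitCounters {
    counts: RwLock<HashMap<QueryInvocationId, AtomicU64>>,
}

impl CacheHitCounters {
    /// Records one cache hit for the given invocation.
    pub fn record_hit(&self, id: QueryInvocationId) {
        {
            // Fast path: the counter already exists, so a shared lock
            // and a relaxed atomic increment are enough.
            let map = self.counts.read().unwrap();
            if let Some(counter) = map.get(&id) {
                counter.fetch_add(1, Ordering::Relaxed);
                return;
            }
        } // The read lock is released here.

        // Slow path: first hit for this invocation, so insert the counter
        // under the exclusive lock (or reuse one inserted by another thread).
        self.counts
            .write()
            .unwrap()
            .entry(id)
            .or_insert_with(|| AtomicU64::new(0))
            .fetch_add(1, Ordering::Relaxed);
    }

    /// Returns all (invocation, hit count) pairs sorted by invocation ID.
    /// A profiler would write these out once, at the end of the session.
    pub fn dump(&self) -> Vec<(QueryInvocationId, u64)> {
        let map = self.counts.read().unwrap();
        let mut out: Vec<_> = map
            .iter()
            .map(|(id, count)| (*id, count.load(Ordering::Relaxed)))
            .collect();
        out.sort();
        out
    }
}
```

Compared with per-hit instant events, this design stores no timestamp or thread ID per hit, which is the trade-off the commit message describes: the metadata is lost, but only the counts are needed for rustc-perf.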
rust-bors bot added a commit to rust-lang/rust that referenced this pull request on Jul 2, 2025 (same commit message as above, with `try-job: dist-powerpc-linux` appended)
bors added a commit to rust-lang/rust that referenced this pull request on Jul 2, 2025 (same commit message as above)
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request on Jul 3, 2025 (same commit message as above)
github-actions bot pushed a commit to kosai-ksdfounder/miri that referenced this pull request on Jul 3, 2025 (same commit message as above)
This is blocked on the query hits number actually being gathered in some non-terribly-expensive way in rustc 😅
Blocked on: rust-lang/rust#142978 (and a corresponding `analyzeme` change).
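Once per-invocation hit counts are available, a table with one row per query needs them summed by query. The following is a rough sketch of that step only, under stated assumptions: `CacheHitRecord` and `hits_per_query` are hypothetical names and do not correspond to any existing `analyzeme` or rustc-perf API, and the query labels used in examples are placeholders.

```rust
use std::collections::BTreeMap;

/// Hypothetical decoded record: the query label and the cache hit count
/// of a single query invocation.
pub struct CacheHitRecord<'a> {
    pub query: &'a str,
    pub hits: u64,
}

/// Sums per-invocation hit counts into one total per query.
/// Rows are sorted by hit count (descending), then by query name.
pub fn hits_per_query(records: &[CacheHitRecord<'_>]) -> Vec<(String, u64)> {
    let mut totals: BTreeMap<&str, u64> = BTreeMap::new();
    for record in records {
        *totals.entry(record.query).or_insert(0) += record.hits;
    }

    let mut rows: Vec<(String, u64)> = totals
        .into_iter()
        .map(|(query, hits)| (query.to_string(), hits))
        .collect();
    rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    rows
}
```

Doing the summation on the analysis side matches the commit message's remark that deduplicating query invocations in rustc itself is probably not required.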