[codegen] assume the tag, not the relative discriminant #144764
Some changes occurred in compiler/rustc_codegen_ssa
// CHECK: tail call void @llvm.assume(i1 %[[A_NOT_HOLE]])
// CHECK: %[[A_DISCR:.+]] = select i1 %[[A_IS_NICHE]], i64 %[[A_REL_DISCR]], i64 1
// LLVM20: %[[A_DISCR:.+]] = select i1 %[[A_IS_NICHE]], i64 %[[A_REL_DISCR]], i64 1
// LLVM21: %[[A_MODIFIED_TAG:.+]] = select i1 %[[A_IS_NICHE]], i64 %[[A_TRUNC]], i64 6
Wouldn't this select be at the bottom for LLVM 21?
Oops, I completely missed that! Thanks.
This looks like a reasonable thing to do to me, but I'm probably the wrong person to check whether the logic is correct. The surrounding niche/discriminant logic looks kinda tricky.
6eb273e to b90e0fb
r? codegen
if niche_variants.contains(&untagged_variant)
    && bx.cx().sess().opts.optimize != OptLevel::No
{
    let ne = bx.icmp(IntPredicate::IntNE, tagged_discr, untagged_variant_const);
In case this helps review, here's how you can see this just from the diff, rather than from https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.TagEncoding.html#variant.Niche:
Previously we were doing tagged_discr != untagged_variant here.
But
relative_discr = tag - niche_start
delta = niche_variants.start()
tagged_discr = relative_discr + delta
so
tagged_discr != untagged_variant
=> relative_discr + delta != untagged_variant
=> (tag - niche_start) + niche_variants.start() != untagged_variant
=> tag != niche_start + untagged_variant - niche_variants.start()
which is the calculation on line 522.
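The rearrangement above can be sanity-checked with a small model. This is a sketch, not the compiler's code: it assumes every value is a `u8` with wrapping arithmetic (the real codegen works at the enum's actual tag and discriminant widths, which may differ), and the names `old_check`, `new_check`, and `checks_agree_for_all_tags` are made up for illustration.

```rust
/// Old form: rebuild the discriminant from the tag, then compare it
/// against the untagged variant.
fn old_check(tag: u8, niche_start: u8, niche_variants_start: u8, untagged_variant: u8) -> bool {
    let relative_discr = tag.wrapping_sub(niche_start);
    let tagged_discr = relative_discr.wrapping_add(niche_variants_start);
    tagged_discr != untagged_variant
}

/// New form: compare the loaded tag against one precomputed constant.
fn new_check(tag: u8, niche_start: u8, niche_variants_start: u8, untagged_variant: u8) -> bool {
    let impossible_tag = niche_start
        .wrapping_add(untagged_variant)
        .wrapping_sub(niche_variants_start);
    tag != impossible_tag
}

/// Exhaustively compare both forms over every possible `u8` tag.
fn checks_agree_for_all_tags(niche_start: u8, niche_variants_start: u8, untagged_variant: u8) -> bool {
    (0..=u8::MAX).all(|tag| {
        old_check(tag, niche_start, niche_variants_start, untagged_variant)
            == new_check(tag, niche_start, niche_variants_start, untagged_variant)
    })
}

fn main() {
    // Layout values invented purely for illustration.
    assert!(checks_agree_for_all_tags(250, 1, 3));
    println!("old and new checks agree for every u8 tag");
}
```

Because both forms are the same equation with a constant moved across the `!=`, they agree for every tag under modular arithmetic; the new form simply leaves no arithmetic on the tag itself.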
@bors r+
b90e0fb to c396521
Rebased now that LLVM21 landed, and found locally that
@bors r=WaffleLapkin rollup=iffy (LLVM21 isn't checked in the PR build, so this might have codegen test failures)
…nant-assume, r=WaffleLapkin

[codegen] assume the tag, not the relative discriminant

Address the issue mentioned in <llvm/llvm-project#134024 (comment)> by changing discriminant calculation to `assume` on the originally-loaded `tag`, rather than on `cast(tag)-OFFSET`.

The previous way does make the *purpose* of the assume clearer, IMHO, since you see `assume(x != 4); if p { x } else { 4 }`, but doing it this way instead means that the `add`s optimize away in LLVM21, which is more important. And this new way is still easily thought of as being like metadata on the load saying specifically which value is impossible.

Demo of the LLVM20 vs LLVM21 difference: <https://llvm.godbolt.org/z/n54x5Mq1T>

r? `@nikic`
Rollup of 19 pull requests

Successful merges:

- #144400 (`tests/ui/issues/`: The Issues Strike Back [3/N])
- #144764 ([codegen] assume the tag, not the relative discriminant)
- #144807 (Streamline config in bootstrap)
- #144899 (Print CGU reuse statistics in `-Zprint-mono-items`)
- #144909 (Add new `test::print_merged_doctests_times` used by rustdoc to display more detailed time information)
- #144912 (Resolver: introduce a conditionally mutable Resolver for (non-)speculative resolution.)
- #144914 (Add support for `ty::Instance` path shortening in diagnostics)
- #144931 ([win][arm64ec] Fix msvc-wholearchive for Arm64EC)
- #144999 (coverage: Remove all unstable support for MC/DC instrumentation)
- #145009 (A couple small changes for rust-analyzer next-solver work)
- #145030 (GVN: Do not flatten derefs with ProjectionElem::Index. )
- #145042 (stdarch subtree update)
- #145047 (move `type_check` out of `compute_regions`)
- #145051 (Prevent name collisions with internal implementation details)
- #145053 (Add a lot of NLL `known-bug` tests)
- #145055 (Move metadata symbol export from exported_non_generic_symbols to exported_symbols)
- #145057 (Clean up some resolved test regressions of const trait removals in std)
- #145068 (Readd myself to review queue)
- #145070 (Add minimal `armv7a-vex-v5` tier three target)

r? `@ghost`
`@rustbot` modify labels: rollup
Rollup merge of #144764 - scottmcm:tweak-impossible-discriminant-assume, r=WaffleLapkin
Out of curiosity, from #145077:
Finished benchmarking commit (e1b67b6): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
Results (primary -2.5%, secondary 2.4%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles
Results (secondary 2.4%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
This benchmark run did not return any relevant results for this metric.

Bootstrap: 464.119s -> 464.902s (0.17%)
Address the issue mentioned in llvm/llvm-project#134024 (comment) by changing discriminant calculation to `assume` on the originally-loaded `tag`, rather than on `cast(tag)-OFFSET`.

The previous way does make the purpose of the assume clearer, IMHO, since you see `assume(x != 4); if p { x } else { 4 }`, but doing it this way instead means that the `add`s optimize away in LLVM21, which is more important. And this new way is still easily thought of as being like metadata on the load saying specifically which value is impossible.

Demo of the LLVM20 vs LLVM21 difference: https://llvm.godbolt.org/z/n54x5Mq1T
r? @nikic
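To make the description concrete, here is a minimal model of niche decoding with the assumption placed on the loaded tag. It is a sketch under stated assumptions, not the code in `rustc_codegen_ssa`: `NicheLayout`, `impossible_tag`, and `discriminant` are hypothetical names, the layout values in `main` are invented, everything is a `u8`, and `debug_assert_ne!` merely stands in for the `llvm.assume` that the real codegen emits.

```rust
use std::ops::RangeInclusive;

/// Hypothetical stand-in for the layout data of a niche-encoded enum.
struct NicheLayout {
    niche_start: u8,
    niche_variants: RangeInclusive<u8>,
    untagged_variant: u8,
}

impl NicheLayout {
    /// The single tag value that would decode to the untagged variant
    /// through the niche path, and therefore never occurs in valid data.
    fn impossible_tag(&self) -> u8 {
        self.niche_start
            .wrapping_add(self.untagged_variant)
            .wrapping_sub(*self.niche_variants.start())
    }

    /// Decode a discriminant from a loaded tag.
    fn discriminant(&self, tag: u8) -> u8 {
        // Model of the assume: stated directly on the loaded tag, like
        // metadata on the load naming the one impossible value.
        debug_assert_ne!(tag, self.impossible_tag());

        let relative_discr = tag.wrapping_sub(self.niche_start);
        let relative_max = self.niche_variants.end() - self.niche_variants.start();
        if relative_discr <= relative_max {
            // Tag falls in the niche range: map it back to a variant index.
            relative_discr.wrapping_add(*self.niche_variants.start())
        } else {
            // Any other tag is payload data belonging to the untagged variant.
            self.untagged_variant
        }
    }
}

fn main() {
    // Invented layout: variants 0..=2, variant 1 carries the payload,
    // and niche tags start at 2 (so tags 0 and 1 are payload values).
    let layout = NicheLayout { niche_start: 2, niche_variants: 0..=2, untagged_variant: 1 };
    assert_eq!(layout.impossible_tag(), 3);
    assert_eq!(layout.discriminant(2), 0);
    assert_eq!(layout.discriminant(4), 2);
    assert_eq!(layout.discriminant(0), 1);
    println!("decoded all valid tags");
}
```

In this model the assumption mentions only `tag` and a constant, so nothing downstream of the subtraction has to stay alive just to feed the assume.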