[PATCH 1/7] [clang] Improve nested name specifier AST representation #147835

Open: wants to merge 1 commit into base: main

Contributor

@mizvekov mizvekov commented Jul 9, 2025

This is a major change to how we represent nested name qualifications in the AST.

  • The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
  • The ElaboratedType node is removed; all type nodes to which it could previously apply can now store the elaborated keyword and name qualifier, tail allocating when present.
  • TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.
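
To illustrate the handle idea from the first bullet, here is a minimal standalone sketch. It is not the code in this patch: `SpecifierHandle`, `NamespaceNode` and `TypeNode` are hypothetical names, and the bit layout (two kind bits in an 8-byte-aligned pointer, leaving one low bit spare) is an assumption chosen only to show how a pointer-sized handle can carry a kind tag without extra storage.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for AST nodes; these are NOT Clang's classes.
struct alignas(8) NamespaceNode { const char *Name; };
struct alignas(8) TypeNode { const char *Name; };

// A pointer-sized handle that keeps its kind in the low bits of an
// 8-byte-aligned pointer. Alignment of 8 leaves three low bits free;
// this sketch uses two of them for the kind, so one bit stays spare.
class SpecifierHandle {
public:
  enum class Kind : std::uintptr_t { Null = 0, Namespace = 1, Type = 2 };

  SpecifierHandle() = default;
  explicit SpecifierHandle(const NamespaceNode *N)
      : Bits(pack(N, Kind::Namespace)) {}
  explicit SpecifierHandle(const TypeNode *T) : Bits(pack(T, Kind::Type)) {}

  Kind getKind() const { return static_cast<Kind>(Bits & KindMask); }

  // Each accessor returns nullptr when the handle holds a different kind.
  const NamespaceNode *getAsNamespace() const {
    return getKind() == Kind::Namespace
               ? static_cast<const NamespaceNode *>(pointer())
               : nullptr;
  }
  const TypeNode *getAsType() const {
    return getKind() == Kind::Type ? static_cast<const TypeNode *>(pointer())
                                   : nullptr;
  }

  explicit operator bool() const { return getKind() != Kind::Null; }

  // Comparing handles is a single integer comparison.
  friend bool operator==(SpecifierHandle A, SpecifierHandle B) {
    return A.Bits == B.Bits;
  }

private:
  static constexpr std::uintptr_t KindMask = 0x3;    // two kind bits
  static constexpr std::uintptr_t LowBitsMask = 0x7; // all alignment bits

  static std::uintptr_t pack(const void *P, Kind K) {
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & LowBitsMask) == 0 && "pointer must be 8-byte aligned");
    return Raw | static_cast<std::uintptr_t>(K);
  }
  const void *pointer() const {
    return reinterpret_cast<const void *>(Bits & ~LowBitsMask);
  }

  std::uintptr_t Bits = 0;
};

static_assert(sizeof(SpecifierHandle) == sizeof(void *),
              "the handle stays pointer-sized");
```

Because such a handle is a single word, it can be passed and compared by value, which is the property that makes it a drop-in replacement for a raw pointer in this sketch.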

This patch offers a great performance benefit.

It greatly improves compilation time for stdexec (https://github.com/NVIDIA/stdexec). For one data point: on test_on2.cpp, the slowest-compiling test in that project, this patch improves -c compilation time by about 7.2% and -fsyntax-only time by about 12%.

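This has great results on compile-time-tracker as well:
[image: compile-time-tracker results, https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831]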

This patch also enables further optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscellaneous drive-by fixes.

About the review: Yes, the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact.

There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go rather than from one type at a time, as that would cause much more churn for users. Also, giving the nested name specifier a different API avoids missing changes related to how prefixes work now, which could otherwise let existing code compile but not work.

How to review: The important changes are all in clang/include/clang/AST and clang/lib/AST, along with clang/lib/Sema/TreeTransform.h.

The rest, which is the bulk of the changes, is mostly a consequence of the API changes.

PS: TagType::getDecl is renamed to getOriginalDecl in this patch, just to make rebasing easier. I plan to rename it back after this lands.

This is split into seven interdependent patches:

  1. This one
  2. [PATCH 2/7] [clang] improve NestedNameSpecifier: misc small clang changes #148012
  3. [PATCH 3/7] [clang] improve NestedNameSpecifier: test changes #148014
  4. [PATCH 4/7] [clang] Improve NestedNameSpecifier: clang-tools-extra changes #148015
  5. [PATCH 5/7] [clang] NNS improvement: getOriginalDecl changes #149747
  6. [PATCH 6/7] [clang] improve NestedNameSpecifier #149748
  7. [PATCH 7/7] [clang] improve NestedNameSpecifier: LLDB changes #149949

Fixes #136624
Fixes #147000
Fixes #43179
Fixes #68670
Fixes #92757

@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch from 374aefc to 8da8c53 Compare July 10, 2025 20:14
@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch from 8da8c53 to 7b74782 Compare July 11, 2025 00:05
Contributor

zyn0217 commented Jul 11, 2025

Given the scale of the patch, it wouldn't be surprising if it went through some revert-reapply cycles after the initial commit.

To reduce the churn, would it be possible to ask Google (and other major downstream clients) to test it internally before we merge? @AaronBallman

@AaronBallman
Collaborator

> Given the scale of the patch, it wouldn't be surprising if it went through some revert-reapply cycles after the initial commit.
>
> To reduce the churn, would it be possible to ask Google (and other major downstream clients) to test it internally before we merge? @AaronBallman

We can certainly ask; it's in their best interests to avoid that amount of churn if there are problems. But I'm not certain we have a way to really do that aside from poking individuals and seeing whether they're willing to try it out internally.

Member

@Sirraide Sirraide left a comment

Some comments, but I will say that I haven’t nearly reviewed all of this because it’s rather massive (and I’m also not too familiar w/ everything around NNSs).

Super,
};

inline Kind getKind() const;

Is there a reason a lot of these declarations are inline? Because from what I know that doesn’t really do anything on declarations that aren’t definitions, does it?

Contributor Author

@mizvekov mizvekov Jul 11, 2025

It does: an inline definition can't be provided here because of circular dependencies, so it is provided in NestedNameSpecifier.h instead. This is a way to get around the interdependence with Decl.h and Type.h.
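
The pattern described here can be sketched in a single file. This is only an illustration under assumptions: `Handle` and `Node` are hypothetical classes, and the comments mark where the split across headers would fall in a real project. The member is declared `inline` where the other class is still incomplete, and the inline definition follows once both classes are fully defined.

```cpp
#include <cassert>

// --- would live in the "base" header --------------------------------------
class Node; // forward declaration only: Node is an incomplete type here

class Handle {
public:
  explicit Handle(const Node *N) : Ptr(N) {}

  // Declared inline, but not defined here: the body needs the complete
  // definition of Node, which is not available at this point.
  inline int getKind() const;

private:
  const Node *Ptr;
};

// --- would live in the header that defines Node ---------------------------
class Node {
public:
  explicit Node(int K) : Kind(K) {}
  int kind() const { return Kind; }

  // Node can return Handle by value because Handle is already complete,
  // which is the other half of the mutual dependency.
  Handle self() const { return Handle(this); }

private:
  int Kind;
};

// --- would live in a later header that includes both of the above ---------
// The inline definition is provided once Node is a complete type.
inline int Handle::getKind() const { return Ptr->kind(); }
```

Any translation unit that calls `getKind()` then needs to include the header carrying the definition, not just the one with the declaration.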

@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch 2 times, most recently from cfe89c1 to 42d0f68 Compare July 14, 2025 21:46
@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch 2 times, most recently from 4896f42 to 73ffe31 Compare July 20, 2025 23:41
@mizvekov mizvekov changed the title [PATCH 1/4] [clang] Improve nested name specifier AST representation [PATCH 1/6] [clang] Improve nested name specifier AST representation Jul 20, 2025
@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch from 73ffe31 to d8acbe8 Compare July 21, 2025 00:07
@mizvekov
Contributor Author

I have further split up this patch series; the changes related to the NNS representation now come last in the series: #149748

I have also moved some of the trivial changes into yet another patch: #149747

As a result, this and the other patches in the series are smaller.
Let me know if that's enough, or if you would rather I keep splitting further.

Otherwise, I have addressed all pending reviews, and this is ready for another round.

Collaborator

@erichkeane erichkeane left a comment

I did as thorough a review as I could. One small nit; otherwise this LGTM once the other reviewers who have commented are happy.

@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch from d8acbe8 to 58b801e Compare July 21, 2025 23:55
mizvekov added a commit that referenced this pull request Jul 22, 2025
@mizvekov mizvekov changed the title [PATCH 1/6] [clang] Improve nested name specifier AST representation [PATCH 1/7] [clang] Improve nested name specifier AST representation Jul 22, 2025
mizvekov added a commit that referenced this pull request Jul 22, 2025
mizvekov added a commit that referenced this pull request Jul 22, 2025
@mizvekov mizvekov force-pushed the users/mizvekov/name-qualification-refactor branch from 58b801e to dcef00f Compare July 22, 2025 19:53
mizvekov added a commit that referenced this pull request Jul 22, 2025
mizvekov added a commit that referenced this pull request Jul 22, 2025
mizvekov added a commit that referenced this pull request Jul 22, 2025
Contributor

zwuis commented Jul 23, 2025

I added the comment "// FIXME (GH147000): duplicate diagnostics" to clang/test/SemaCXX/nested-name-spec.cpp in #147003. Please remove the comment if that issue is fixed by this patch.

Labels
backend:AArch64 backend:AMDGPU backend:ARC backend:ARM backend:CSKY backend:Hexagon backend:Lanai backend:loongarch backend:MIPS backend:PowerPC backend:RISC-V backend:Sparc backend:SystemZ backend:WebAssembly backend:X86 clang:analysis clang:as-a-library libclang and C++ API clang:bytecode Issues for the clang bytecode constexpr interpreter clang:codegen IR generation bugs: mangling, exceptions, etc. clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang:openmp OpenMP related changes to Clang clang:static analyzer clang Clang issues not falling into any other category clang-tidy clang-tools-extra clangd coroutines C++20 coroutines debuginfo HLSL HLSL Language Support
Projects
None yet
9 participants