Skip to content

[TableGen] Apply the implied subregidx optimization more widely #149709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jayfoad
Copy link
Contributor

@jayfoad jayfoad commented Jul 20, 2025

inferMatchingSuperRegClass has an optimization to skip subreg indices
that are implied by ones it has already handled. Apply this optimization
for every subreg index that is fully supported by RC.

This gives more than 2x speed-up in inferMatchingSuperRegClass when
generating AMDGPUGenRegisterInfo.inc without changing the contents of
any generated files.

inferMatchingSuperRegClass has an optimization to skip subreg indices
that are implied by ones it has already handled. Apply this optimization
for every subreg index that is fully supported by RC.

This gives more than 2x speed-up in inferMatchingSuperRegClass when
generating AMDGPUGenRegisterInfo.inc without changing the contents of
any generated files.
@llvmbot
Copy link
Member

llvmbot commented Jul 20, 2025

@llvm/pr-subscribers-tablegen

Author: Jay Foad (jayfoad)

Changes

inferMatchingSuperRegClass has an optimization to skip subreg indices
that are implied by ones it has already handled. Apply this optimization
for every subreg index that is fully supported by RC.

This gives more than 2x speed-up in inferMatchingSuperRegClass when
generating AMDGPUGenRegisterInfo.inc without changing the contents of
any generated files.


Full diff: https://github.com/llvm/llvm-project/pull/149709.diff

1 Files Affected:

  • (modified) llvm/utils/TableGen/Common/CodeGenRegisters.cpp (+17-17)
diff --git a/llvm/utils/TableGen/Common/CodeGenRegisters.cpp b/llvm/utils/TableGen/Common/CodeGenRegisters.cpp
index f78427940b276..cd5515e757f8f 100644
--- a/llvm/utils/TableGen/Common/CodeGenRegisters.cpp
+++ b/llvm/utils/TableGen/Common/CodeGenRegisters.cpp
@@ -2352,6 +2352,23 @@ void CodeGenRegBank::inferMatchingSuperRegClass(
     if (ImpliedSubRegIndices.contains(SubIdx))
       continue;
 
+    // We can skip checking subregister indices that can be composed from
+    // the current SubIdx.
+    //
+    // Proof sketch: Let SubRC' be another register class and SubSubIdx
+    // a subregister index that can be composed from SubIdx.
+    //
+    // Calling this function with SubRC in place of RC ensures the existence
+    // of a subclass X of SubRC with the registers that have subregisters in
+    // SubRC'.
+    //
+    // The set of registers in RC with SubSubIdx in SubRC' is equal to the
+    // set of registers in RC with SubIdx in X (because every register in
+    // RC has a corresponding subregister in SubRC), and so checking the
+    // pair (SubSubIdx, SubRC') is redundant with checking (SubIdx, X).
+    for (const auto &SubSubIdx : SubIdx->getComposites())
+      ImpliedSubRegIndices.insert(SubSubIdx.second);
+
     // Build list of (Sub, Super) pairs for this SubIdx, sorted by Sub. Note
     // that the list may contain entries with the same Sub but different Supers.
     SubRegs.clear();
@@ -2390,23 +2407,6 @@ void CodeGenRegBank::inferMatchingSuperRegClass(
       if (SubSetVec.size() == RC->getMembers().size()) {
         SubRC.addSuperRegClass(SubIdx, RC);
 
-        // We can skip checking subregister indices that can be composed from
-        // the current SubIdx.
-        //
-        // Proof sketch: Let SubRC' be another register class and SubSubIdx
-        // a subregister index that can be composed from SubIdx.
-        //
-        // Calling this function with SubRC in place of RC ensures the existence
-        // of a subclass X of SubRC with the registers that have subregisters in
-        // SubRC'.
-        //
-        // The set of registers in RC with SubSubIdx in SubRC' is equal to the
-        // set of registers in RC with SubIdx in X (because every register in
-        // RC has a corresponding subregister in SubRC), and so checking the
-        // pair (SubSubIdx, SubRC') is redundant with checking (SubIdx, X).
-        for (const auto &SubSubIdx : SubIdx->getComposites())
-          ImpliedSubRegIndices.insert(SubSubIdx.second);
-
         continue;
       }
 

@nvjle
Copy link
Contributor

nvjle commented Jul 21, 2025

Hi @jayfoad ,

I quickly tried this on a downstream back-end and see some differences in our XXXGenRegisterInfo.inc. This may or may not be innocuous and I have not investigated in detail. Our static lit tests check out (at best a sanity check), but I/we haven't yet tried to run thousands of shaders for perf/correctness to be sure. I also have not yet had time to context-switch in inferMatchingSuperRegClass (which I never fully understood in the first place), and the below may or may not be helpful.

A quick diff of the generated reg info shows a few differences along these lines for getSubRegisterClass (actual widths replaced by x/y/i/j until I can upstream this back-end):

Old:
     {  // GPRx_U_GPRy
       1,       // GPRx_U_GPRy:dsub0 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub1 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub2 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub3 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub4 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub5 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub6 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub7 -> GPRi_U_GPRj

New:
     {  // GPRx_U_GPRy
       4,       // GPRx_U_GPRy:dsub0 -> GPRi
       4,       // GPRx_U_GPRy:dsub1 -> GPRi
       4,       // GPRx_U_GPRy:dsub2 -> GPRi
       4,       // GPRx_U_GPRy:dsub3 -> GPRi
       4,       // GPRx_U_GPRy:dsub4 -> GPRi
       1,       // GPRx_U_GPRy:dsub5 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub6 -> GPRi_U_GPRj
       1,       // GPRx_U_GPRy:dsub7 -> GPRi_U_GPRj

Notably, all the differences are related to the union classes (RC1 union RC2) we use for Greedy's inflation mechanism. What seems to be happening is that addSuperRegClass below

     // RC injects completely into SubRC.
      if (SubSetVec.size() == RC->getMembers().size()) {
        SubRC.addSuperRegClass(SubIdx, RC);

no longer adds, e.g., GPRx_U_GPRy where it did before.

This is just a heads-up that there is at least one back-end where there are reg info differences, but I cannot yet say whether this is actually a problem (including in our own very complicated register hierarchy with a proliferation of long tuple classes, overlaps, and inflation classes).

@jayfoad
Copy link
Contributor Author

jayfoad commented Jul 21, 2025

Thanks! It certainly makes sense that the diffs would be related to missing addSuperRegClass calls. I will take a closer look at this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants