[ScheduleDAG] Fix and simplify the algorithm used for finding callseq_start #149692

mshockwave · 2025-07-20T06:53:18Z

ScheduleDAGRR uses FindCallSeqStart to find the matching callseq_start from a callseq_end. It does this search using a combination of DFS-ish + heuristics based on maximum nested level of callseq_start & callseq_end on each search path. But it struggles on cases like #146213 in which search paths that resolved into the same nested level triggered an assertion.
Fundamentally, I don't think heuristics used in the original algorithm is 100% correct: for a given callseq_end, if there are multiple callseq_start on the dependency chains that could fully match it -- even taking into the account of nested callseq_start/end -- then there is probably something wrong in the construction of the DAG. So searching different search paths for maximum nested level might not be necessary.
I simplify this function by visiting related nodes with proper DFS and matches callseq_start/end along the way.

I guess this hasn't been a problem for nearly 15(!) years since the original algorithm was written because only ISAs that require lots of memory movements around function calls (i.e. pushing arguments to stack) are more likely to trigger. And X86 is probably the only ISA other than 68k that qualifies, but I guess X86 has a slightly different chain value arrangement in this case.

…_start

llvmbot · 2025-07-20T06:53:48Z

@llvm/pr-subscribers-backend-m68k

@llvm/pr-subscribers-llvm-selectiondag

Author: Min-Yih Hsu (mshockwave)

Changes

ScheduleDAGRR uses FindCallSeqStart to find the matching callseq_start from a callseq_end. It does this search using a combination of DFS-ish + heuristics based on maximum nested level of callseq_start & callseq_end on each search path. But it struggles on cases like #146213 in which search paths that resolved into the same nested level triggered an assertion.
Fundamentally, I don't think heuristics used in the original algorithm is 100% correct: for a given callseq_end, if there are multiple callseq_start on the dependency chains that could fully match it -- even taking into the account of nested callseq_start/end -- then there is probably something wrong in the construction of the DAG. So searching different search paths for maximum nested level might not be necessary.
I simplify this function by visiting related nodes with proper DFS and matches callseq_start/end along the way.

I guess this hasn't been a problem for nearly 15(!) years since the original algorithm was written because only ISAs that require lots of memory movements around function calls (i.e. pushing arguments to stack) are more likely to trigger. And X86 is probably the only ISA other than 68k that qualifies, but I guess X86 has a slightly different chain value arrangement in this case.

Full diff: https://github.com/llvm/llvm-project/pull/149692.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp (+41-44)
(added) llvm/test/CodeGen/M68k/PR146213.ll (+31)

diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
index a570b71ecd28d..d45e0dbf45df5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
@@ -478,62 +478,61 @@ static bool IsChainDependent(SDNode *Outer, SDNode *Inner,
   }
 }
 
-/// FindCallSeqStart - Starting from the (lowered) CALLSEQ_END node, locate
+/// findCallSeqStart - Starting from the (lowered) CALLSEQ_END node, locate
 /// the corresponding (lowered) CALLSEQ_BEGIN node.
 ///
-/// NestLevel and MaxNested are used in recursion to indcate the current level
-/// of nesting of CALLSEQ_BEGIN and CALLSEQ_END pairs, as well as the maximum
-/// level seen so far.
-///
 /// TODO: It would be better to give CALLSEQ_END an explicit operand to point
 /// to the corresponding CALLSEQ_BEGIN to avoid needing to search for it.
-static SDNode *
-FindCallSeqStart(SDNode *N, unsigned &NestLevel, unsigned &MaxNest,
-                 const TargetInstrInfo *TII) {
-  while (true) {
-    // For a TokenFactor, examine each operand. There may be multiple ways
-    // to get to the CALLSEQ_BEGIN, but we need to find the path with the
-    // most nesting in order to ensure that we find the corresponding match.
+static SDNode *findCallSeqStart(SDNode *CallSeqEnd,
+                                const TargetInstrInfo *TII) {
+  SmallVector<SDNode *> DFSStack;
+  SmallPtrSet<SDNode *, 8> Visited;
+  unsigned CallSeqDepth = 0;
+
+  DFSStack.push_back(CallSeqEnd);
+  Visited.insert(CallSeqEnd);
+
+  while (!DFSStack.empty()) {
+    SDNode *N = DFSStack.back();
+
     if (N->getOpcode() == ISD::TokenFactor) {
-      SDNode *Best = nullptr;
-      unsigned BestMaxNest = MaxNest;
       for (const SDValue &Op : N->op_values()) {
-        unsigned MyNestLevel = NestLevel;
-        unsigned MyMaxNest = MaxNest;
-        if (SDNode *New = FindCallSeqStart(Op.getNode(),
-                                           MyNestLevel, MyMaxNest, TII))
-          if (!Best || (MyMaxNest > BestMaxNest)) {
-            Best = New;
-            BestMaxNest = MyMaxNest;
-          }
+        if (!Visited.insert(Op.getNode()).second)
+          continue;
+        DFSStack.push_back(Op.getNode());
+        break;
       }
-      assert(Best);
-      MaxNest = BestMaxNest;
-      return Best;
+      if (DFSStack.back() == N)
+        DFSStack.pop_back();
+      continue;
     }
-    // Check for a lowered CALLSEQ_BEGIN or CALLSEQ_END.
+
+    // Not a TokenFactor, no need to keep it in the DFS stack.
+    DFSStack.pop_back();
+
     if (N->isMachineOpcode()) {
       if (N->getMachineOpcode() == TII->getCallFrameDestroyOpcode()) {
-        ++NestLevel;
-        MaxNest = std::max(MaxNest, NestLevel);
+        ++CallSeqDepth;
       } else if (N->getMachineOpcode() == TII->getCallFrameSetupOpcode()) {
-        assert(NestLevel != 0);
-        --NestLevel;
-        if (NestLevel == 0)
+        assert(CallSeqDepth);
+        if (--CallSeqDepth == 0)
           return N;
       }
     }
-    // Otherwise, find the chain and continue climbing.
-    for (const SDValue &Op : N->op_values())
-      if (Op.getValueType() == MVT::Other) {
-        N = Op.getNode();
-        goto found_chain_operand;
-      }
-    return nullptr;
-  found_chain_operand:;
-    if (N->getOpcode() == ISD::EntryToken)
-      return nullptr;
+
+    // Find the chain and continue climbing.
+    // A SDNode should only have a single input chain.
+    auto OpValues = N->op_values();
+    auto ItChain = find_if(OpValues, [](const SDValue &Op) {
+      return Op.getValueType() == MVT::Other;
+    });
+    if (ItChain != OpValues.end() &&
+        (*ItChain)->getOpcode() != ISD::EntryToken &&
+        Visited.insert((*ItChain).getNode()).second)
+      DFSStack.push_back((*ItChain).getNode());
   }
+
+  return nullptr;
 }
 
 /// Call ReleasePred for each predecessor, then update register live def/gen.
@@ -581,9 +580,7 @@ void ScheduleDAGRRList::ReleasePredecessors(SUnit *SU) {
     for (SDNode *Node = SU->getNode(); Node; Node = Node->getGluedNode())
       if (Node->isMachineOpcode() &&
           Node->getMachineOpcode() == TII->getCallFrameDestroyOpcode()) {
-        unsigned NestLevel = 0;
-        unsigned MaxNest = 0;
-        SDNode *N = FindCallSeqStart(Node, NestLevel, MaxNest, TII);
+        SDNode *N = findCallSeqStart(Node, TII);
         assert(N && "Must find call sequence start");
 
         SUnit *Def = &SUnits[N->getNodeId()];
diff --git a/llvm/test/CodeGen/M68k/PR146213.ll b/llvm/test/CodeGen/M68k/PR146213.ll
new file mode 100644
index 0000000000000..a2aca8927ce86
--- /dev/null
+++ b/llvm/test/CodeGen/M68k/PR146213.ll
@@ -0,0 +1,31 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=m68k-linux | FileCheck %s
+
+define double @minimized() nounwind {
+; CHECK-LABEL: minimized:
+; CHECK:       ; %bb.0: ; %start
+; CHECK-NEXT:    suba.l #20, %sp
+; CHECK-NEXT:    movem.l %a2, (16,%sp) ; 8-byte Folded Spill
+; CHECK-NEXT:    move.l #0, (4,%sp)
+; CHECK-NEXT:    move.l #0, (%sp)
+; CHECK-NEXT:    jsr $0
+; CHECK-NEXT:    suba.l %a2, %a2
+; CHECK-NEXT:    move.l (%a2), (%sp)
+; CHECK-NEXT:    move.l #0, (12,%sp)
+; CHECK-NEXT:    move.l #0, (8,%sp)
+; CHECK-NEXT:    move.l $4, (4,%sp)
+; CHECK-NEXT:    jsr __muldf3
+; CHECK-NEXT:    move.l %d0, (%a2)
+; CHECK-NEXT:    move.l %d1, $4
+; CHECK-NEXT:    move.l %a2, %d0
+; CHECK-NEXT:    move.l %a2, %d1
+; CHECK-NEXT:    movem.l (16,%sp), %a2 ; 8-byte Folded Reload
+; CHECK-NEXT:    adda.l #20, %sp
+; CHECK-NEXT:    rts
+start:
+  %_58 = call double null(double 0.000000e+00)
+  %_60 = load double, ptr null, align 8
+  %_57 = fmul double 0.000000e+00, %_60
+  store double %_57, ptr null, align 8
+  ret double 0.000000e+00
+}

Aylos9er

Clean up*

mshockwave · 2025-07-20T17:33:31Z

Clean up*

I don't quite follow, could you point out the part you were commenting on?

[ScheduleDAG] Fix and simplify the algorithm used for finding callseq…

c9ea1ff

…_start

mshockwave requested review from arsenm, RKSimon, sunfishcode and topperc July 20, 2025 06:53

llvmbot added backend:m68k llvm:SelectionDAG SelectionDAGISel as well labels Jul 20, 2025

Aylos9er reviewed Jul 20, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[ScheduleDAG] Fix and simplify the algorithm used for finding callseq_start #149692

[ScheduleDAG] Fix and simplify the algorithm used for finding callseq_start #149692

mshockwave commented Jul 20, 2025

Uh oh!

llvmbot commented Jul 20, 2025 •

edited

Loading

Uh oh!

Aylos9er left a comment

Uh oh!

mshockwave commented Jul 20, 2025

Uh oh!

Uh oh!

[ScheduleDAG] Fix and simplify the algorithm used for finding callseq_start #149692

Are you sure you want to change the base?

[ScheduleDAG] Fix and simplify the algorithm used for finding callseq_start #149692

Conversation

mshockwave commented Jul 20, 2025

Uh oh!

llvmbot commented Jul 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Aylos9er left a comment

Choose a reason for hiding this comment

Uh oh!

mshockwave commented Jul 20, 2025

Uh oh!

Uh oh!

llvmbot commented Jul 20, 2025 •

edited

Loading