[CIR] Add support for array constructors #149142

Merged: 8 commits, Jul 24, 2025
Conversation

@mmha mmha commented Jul 16, 2025

This patch upstreams support for creating arrays of classes that require calling a constructor.

  • Adds the ArrayCtor operation
  • New lowering pass for lowering ArrayCtor to a loop
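
For readers less familiar with the source-level semantics being modeled, here is a minimal standalone C++ sketch (not part of the patch) showing that declaring an array of a class type runs the same default constructor once per element. The `Counted` type, the `gConstructed` counter, and `constructLocalArray` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical global counter: incremented by every constructor call.
static std::size_t gConstructed = 0;

// Hypothetical element type with a user-provided default constructor.
struct Counted {
  Counted() { ++gConstructed; }
};

// Declares a local array of 42 elements and returns how many constructor
// calls that single declaration triggered.
std::size_t constructLocalArray() {
  std::size_t before = gConstructed;
  Counted s[42]; // one default-constructor call per element
  (void)s;       // silence unused-variable warnings
  return gConstructed - before;
}
```

Calling `constructLocalArray()` is expected to report one constructor call per element, which is the behavior the ArrayCtor operation and its lowering to a loop have to preserve.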

@mmha mmha requested a review from erichkeane July 16, 2025 17:04
@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Jul 16, 2025
llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-clangir

@llvm/pr-subscribers-clang

Author: Morris Hafner (mmha)

Patch is 24.31 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149142.diff

9 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/IR/CIROps.td (+37-1)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRTypeConstraints.td (+8)
  • (modified) clang/lib/CIR/CodeGen/CIRGenClass.cpp (+110)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+30-29)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.cpp (+45)
  • (modified) clang/lib/CIR/CodeGen/CIRGenFunction.h (+12)
  • (modified) clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp (+85-1)
  • (added) clang/test/CIR/CodeGen/array-ctor.cpp (+70)
  • (added) clang/test/CIR/IR/array-ctor.cir (+29)
diff --git a/clang/include/clang/CIR/Dialect/IR/CIROps.td b/clang/include/clang/CIR/Dialect/IR/CIROps.td
index d19cd83d78b40..4c30a54bc650f 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIROps.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -607,7 +607,7 @@ def CIR_ConditionOp : CIR_Op<"condition", [
 //===----------------------------------------------------------------------===//
 
 defvar CIR_YieldableScopes = [
-  "CaseOp", "DoWhileOp", "ForOp", "IfOp", "ScopeOp", "SwitchOp",
+  "ArrayCtor", "CaseOp", "DoWhileOp", "ForOp", "IfOp", "ScopeOp", "SwitchOp",
   "TernaryOp", "WhileOp"
 ];
 
@@ -2219,6 +2219,42 @@ def CIR_TrapOp : CIR_Op<"trap", [Terminator]> {
   let assemblyFormat = "attr-dict";
 }
 
+//===----------------------------------------------------------------------===//
+// ArrayCtor
+//===----------------------------------------------------------------------===//
+
+class CIR_ArrayInitDestroy<string mnemonic> : CIR_Op<mnemonic> {
+  let arguments = (ins
+    Arg<CIR_PtrToArray, "array address", [MemWrite, MemRead]>:$addr
+  );
+
+  let regions = (region SizedRegion<1>:$body);
+  let assemblyFormat = [{
+    `(` $addr `:` qualified(type($addr)) `)` $body attr-dict
+  }];
+
+  let builders = [
+    OpBuilder<(ins "mlir::Value":$addr,
+      "llvm::function_ref<void(mlir::OpBuilder &, mlir::Location)>":$regionBuilder), [{
+        assert(regionBuilder && "builder callback expected");
+        mlir::OpBuilder::InsertionGuard guard($_builder);
+        mlir::Region *r = $_state.addRegion();
+        $_state.addOperands(ValueRange{addr});
+        $_builder.createBlock(r);
+        regionBuilder($_builder, $_state.location);
+    }]>
+  ];
+}
+
+def CIR_ArrayCtor : CIR_ArrayInitDestroy<"array.ctor"> {
+  let summary = "Initialize array elements with C++ constructors";
+  let description = [{
+    Initialize each array element using the same C++ constructor. This
+    operation has one region, with one single block. The block has an
+    incoming argument for the current array index to initialize.
+  }];
+}
+
 //===----------------------------------------------------------------------===//
 // VecCreate
 //===----------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRTypeConstraints.td b/clang/include/clang/CIR/Dialect/IR/CIRTypeConstraints.td
index 2bf77583465a6..d7d55dfbc0654 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRTypeConstraints.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRTypeConstraints.td
@@ -165,6 +165,12 @@ def CIR_AnyIntOrFloatType : AnyTypeOf<[CIR_AnyFloatType, CIR_AnyIntType],
 
 def CIR_AnyComplexType : CIR_TypeBase<"::cir::ComplexType", "complex type">;
 
+//===----------------------------------------------------------------------===//
+// Array Type predicates
+//===----------------------------------------------------------------------===//
+
+def CIR_AnyArrayType : CIR_TypeBase<"::cir::ArrayType", "array type">;
+
 //===----------------------------------------------------------------------===//
 // Pointer Type predicates
 //===----------------------------------------------------------------------===//
@@ -216,6 +222,8 @@ def CIR_PtrToIntOrFloatType : CIR_PtrToType<CIR_AnyIntOrFloatType>;
 
 def CIR_PtrToComplexType : CIR_PtrToType<CIR_AnyComplexType>;
 
+def CIR_PtrToArray : CIR_PtrToType<CIR_AnyArrayType>;
+
 //===----------------------------------------------------------------------===//
 // Vector Type predicates
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/CIR/CodeGen/CIRGenClass.cpp b/clang/lib/CIR/CodeGen/CIRGenClass.cpp
index 8667bb60d114e..26ace1c366780 100644
--- a/clang/lib/CIR/CodeGen/CIRGenClass.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenClass.cpp
@@ -12,6 +12,7 @@
 
 #include "CIRGenCXXABI.h"
 #include "CIRGenFunction.h"
+#include "CIRGenValue.h"
 
 #include "clang/AST/ExprCXX.h"
 #include "clang/AST/RecordLayout.h"
@@ -311,6 +312,115 @@ void CIRGenFunction::emitInitializerForField(FieldDecl *field, LValue lhs,
   assert(!cir::MissingFeatures::requiresCleanups());
 }
 
+/// Emit a loop to call a particular constructor for each of several members
+/// of an array.
+///
+/// \param ctor the constructor to call for each element
+/// \param arrayType the type of the array to initialize
+/// \param arrayBegin an arrayType*
+/// \param zeroInitialize true if each element should be
+///   zero-initialized before it is constructed
+void CIRGenFunction::emitCXXAggrConstructorCall(
+    const CXXConstructorDecl *ctor, const clang::ArrayType *arrayType,
+    Address arrayBegin, const CXXConstructExpr *e, bool newPointerIsChecked,
+    bool zeroInitialize) {
+  QualType elementType;
+  mlir::Value numElements = emitArrayLength(arrayType, elementType, arrayBegin);
+  emitCXXAggrConstructorCall(ctor, numElements, arrayBegin, e,
+                             newPointerIsChecked, zeroInitialize);
+}
+
+/// Emit a loop to call a particular constructor for each of several members
+/// of an array.
+///
+/// \param ctor the constructor to call for each element
+/// \param numElements the number of elements in the array;
+///   may be zero
+/// \param arrayBase a T*, where T is the type constructed by ctor
+/// \param zeroInitialize true if each element should be
+///   zero-initialized before it is constructed
+void CIRGenFunction::emitCXXAggrConstructorCall(
+    const CXXConstructorDecl *ctor, mlir::Value numElements, Address arrayBase,
+    const CXXConstructExpr *e, bool newPointerIsChecked, bool zeroInitialize) {
+  // It's legal for numElements to be zero.  This can happen both
+  // dynamically, because x can be zero in 'new A[x]', and statically,
+  // because of GCC extensions that permit zero-length arrays.  There
+  // are probably legitimate places where we could assume that this
+  // doesn't happen, but it's not clear that it's worth it.
+
+  // Optimize for a constant count.
+  auto constantCount = dyn_cast<cir::ConstantOp>(numElements.getDefiningOp());
+  if (constantCount) {
+    auto constIntAttr = mlir::dyn_cast<cir::IntAttr>(constantCount.getValue());
+    // Just skip out if the constant count is zero.
+    if (constIntAttr && constIntAttr.getUInt() == 0)
+      return;
+    // Otherwise, emit the check.
+  } else {
+    cgm.errorNYI(e->getSourceRange(), "dynamic-length array expression");
+  }
+
+  auto arrayTy = mlir::dyn_cast<cir::ArrayType>(arrayBase.getElementType());
+  assert(arrayTy && "expected array type");
+  mlir::Type elementType = arrayTy.getElementType();
+  cir::PointerType ptrToElmType = builder.getPointerTo(elementType);
+
+  // Traditional LLVM codegen emits a loop here. CIR lowers to a loop as part of
+  // LoweringPrepare.
+
+  // The alignment of the base, adjusted by the size of a single element,
+  // provides a conservative estimate of the alignment of every element.
+  // (This assumes we never start tracking offsetted alignments.)
+  //
+  // Note that these are complete objects and so we don't need to
+  // use the non-virtual size or alignment.
+  QualType type = getContext().getTypeDeclType(ctor->getParent());
+  CharUnits eltAlignment = arrayBase.getAlignment().alignmentOfArrayElement(
+      getContext().getTypeSizeInChars(type));
+
+  // Zero initialize the storage, if requested.
+  if (zeroInitialize)
+    emitNullInitialization(*currSrcLoc, arrayBase, type);
+
+  // C++ [class.temporary]p4:
+  // There are two contexts in which temporaries are destroyed at a different
+  // point than the end of the full-expression. The first context is when a
+  // default constructor is called to initialize an element of an array.
+  // If the constructor has one or more default arguments, the destruction of
+  // every temporary created in a default argument expression is sequenced
+  // before the construction of the next array element, if any.
+  {
+    assert(!cir::MissingFeatures::runCleanupsScope());
+
+    // Evaluate the constructor and its arguments in a regular
+    // partial-destroy cleanup.
+    if (getLangOpts().Exceptions &&
+        !ctor->getParent()->hasTrivialDestructor()) {
+      cgm.errorNYI(e->getSourceRange(), "partial array cleanups");
+    }
+
+    // Emit the constructor call that will execute for every array element.
+    auto arrayOp = builder.createPtrBitcast(arrayBase.getPointer(), arrayTy);
+    builder.create<cir::ArrayCtor>(
+        *currSrcLoc, arrayOp, [&](mlir::OpBuilder &b, mlir::Location loc) {
+          auto arg = b.getInsertionBlock()->addArgument(ptrToElmType, loc);
+          Address curAddr = Address(arg, elementType, eltAlignment);
+          assert(!cir::MissingFeatures::sanitizers());
+          auto currAVS = AggValueSlot::forAddr(
+              curAddr, type.getQualifiers(), AggValueSlot::IsDestructed,
+              AggValueSlot::IsNotAliased, AggValueSlot::DoesNotOverlap,
+              AggValueSlot::IsNotZeroed);
+          emitCXXConstructorCall(ctor, Ctor_Complete,
+                                 /*ForVirtualBase=*/false,
+                                 /*Delegating=*/false, currAVS, e);
+          builder.create<cir::YieldOp>(loc);
+        });
+  }
+
+  if (constantCount.use_empty())
+    constantCount.erase();
+}
+
 void CIRGenFunction::emitDelegateCXXConstructorCall(
     const CXXConstructorDecl *ctor, CXXCtorType ctorType,
     const FunctionArgList &args, SourceLocation loc) {
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index 51da48d330f55..9f4d883066055 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -1594,37 +1594,38 @@ void CIRGenFunction::emitCXXConstructExpr(const CXXConstructExpr *e,
     return;
   }
 
-  if (getContext().getAsArrayType(e->getType())) {
-    cgm.errorNYI(e->getSourceRange(), "emitCXXConstructExpr: array type");
-    return;
-  }
+  if (const ArrayType *arrayType = getContext().getAsArrayType(e->getType())) {
+    assert(!cir::MissingFeatures::sanitizers());
+    emitCXXAggrConstructorCall(cd, arrayType, dest.getAddress(), e, false);
+  } else {
 
-  clang::CXXCtorType type = Ctor_Complete;
-  bool forVirtualBase = false;
-  bool delegating = false;
-
-  switch (e->getConstructionKind()) {
-  case CXXConstructionKind::Complete:
-    type = Ctor_Complete;
-    break;
-  case CXXConstructionKind::Delegating:
-    // We should be emitting a constructor; GlobalDecl will assert this
-    type = curGD.getCtorType();
-    delegating = true;
-    break;
-  case CXXConstructionKind::VirtualBase:
-    // This should just set 'forVirtualBase' to true and fall through, but
-    // virtual base class support is otherwise missing, so this needs to wait
-    // until it can be tested.
-    cgm.errorNYI(e->getSourceRange(),
-                 "emitCXXConstructExpr: virtual base constructor");
-    return;
-  case CXXConstructionKind::NonVirtualBase:
-    type = Ctor_Base;
-    break;
-  }
+    clang::CXXCtorType type = Ctor_Complete;
+    bool forVirtualBase = false;
+    bool delegating = false;
 
-  emitCXXConstructorCall(cd, type, forVirtualBase, delegating, dest, e);
+    switch (e->getConstructionKind()) {
+    case CXXConstructionKind::Complete:
+      type = Ctor_Complete;
+      break;
+    case CXXConstructionKind::Delegating:
+      // We should be emitting a constructor; GlobalDecl will assert this
+      type = curGD.getCtorType();
+      delegating = true;
+      break;
+    case CXXConstructionKind::VirtualBase:
+      // This should just set 'forVirtualBase' to true and fall through, but
+      // virtual base class support is otherwise missing, so this needs to wait
+      // until it can be tested.
+      cgm.errorNYI(e->getSourceRange(),
+                   "emitCXXConstructExpr: virtual base constructor");
+      return;
+    case CXXConstructionKind::NonVirtualBase:
+      type = Ctor_Base;
+      break;
+    }
+
+    emitCXXConstructorCall(cd, type, forVirtualBase, delegating, dest, e);
+  }
 }
 
 RValue CIRGenFunction::emitReferenceBindingToExpr(const Expr *e) {
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.cpp b/clang/lib/CIR/CodeGen/CIRGenFunction.cpp
index e532b9d855843..bdc1df41c26e8 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.cpp
@@ -805,4 +805,49 @@ bool CIRGenFunction::shouldNullCheckClassCastValue(const CastExpr *ce) {
   return true;
 }
 
+/// Computes the length of an array in elements, as well as the base
+/// element type and a properly-typed first element pointer.
+mlir::Value
+CIRGenFunction::emitArrayLength(const clang::ArrayType *origArrayType,
+                                QualType &baseType, Address &addr) {
+  const clang::ArrayType *arrayType = origArrayType;
+
+  // If it's a VLA, we have to load the stored size.  Note that
+  // this is the size of the VLA in bytes, not its size in elements.
+  if (isa<VariableArrayType>(arrayType)) {
+    cgm.errorNYI(*currSrcLoc, "VLAs");
+    return builder.getConstInt(*currSrcLoc, SizeTy, 0);
+  }
+
+  uint64_t countFromCLAs = 1;
+  QualType eltType;
+
+  auto cirArrayType = mlir::dyn_cast<cir::ArrayType>(addr.getElementType());
+
+  while (cirArrayType) {
+    assert(isa<ConstantArrayType>(arrayType));
+    countFromCLAs *= cirArrayType.getSize();
+    eltType = arrayType->getElementType();
+
+    cirArrayType =
+        mlir::dyn_cast<cir::ArrayType>(cirArrayType.getElementType());
+
+    arrayType = getContext().getAsArrayType(arrayType->getElementType());
+    assert((!cirArrayType || arrayType) &&
+           "CIR and Clang types are out-of-sync");
+  }
+
+  if (arrayType) {
+    // From this point onwards, the Clang array type has been emitted
+    // as some other type (probably a packed struct). Compute the array
+    // size, and just emit the 'begin' expression as a bitcast.
+    cgm.errorNYI(*currSrcLoc, "length for non-array underlying types");
+  }
+
+  baseType = eltType;
+  auto numElements = builder.getConstInt(*currSrcLoc, SizeTy, countFromCLAs);
+
+  return numElements;
+}
+
 } // namespace clang::CIRGen
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index 3baabba5adfe1..b4f79338ae81e 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -761,6 +761,8 @@ class CIRGenFunction : public CIRGenTypeCache {
   /// even if no aggregate location is provided.
   RValue emitAnyExprToTemp(const clang::Expr *e);
 
+  mlir::Value emitArrayLength(const clang::ArrayType *arrayType,
+                              QualType &baseType, Address &addr);
   LValue emitArraySubscriptExpr(const clang::ArraySubscriptExpr *e);
 
   Address emitArrayToPointerDecay(const Expr *array);
@@ -837,6 +839,16 @@ class CIRGenFunction : public CIRGenTypeCache {
   void emitCXXConstructExpr(const clang::CXXConstructExpr *e,
                             AggValueSlot dest);
 
+  void emitCXXAggrConstructorCall(const CXXConstructorDecl *ctor,
+                                  const clang::ArrayType *arrayType,
+                                  Address arrayBegin, const CXXConstructExpr *e,
+                                  bool newPointerIsChecked,
+                                  bool zeroInitialize = false);
+  void emitCXXAggrConstructorCall(const CXXConstructorDecl *ctor,
+                                  mlir::Value numElements, Address arrayBase,
+                                  const CXXConstructExpr *e,
+                                  bool newPointerIsChecked,
+                                  bool zeroInitialize);
   void emitCXXConstructorCall(const clang::CXXConstructorDecl *d,
                               clang::CXXCtorType type, bool forVirtualBase,
                               bool delegating, AggValueSlot thisAVS,
diff --git a/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp b/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
index 5493b86a0a321..d32d47761b485 100644
--- a/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+++ b/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
@@ -8,6 +8,8 @@
 
 #include "PassDetail.h"
 #include "clang/AST/ASTContext.h"
+#include "clang/AST/CharUnits.h"
+#include "clang/CIR/Dialect/Builder/CIRBaseBuilder.h"
 #include "clang/CIR/Dialect/IR/CIRDialect.h"
 #include "clang/CIR/Dialect/Passes.h"
 
@@ -22,15 +24,97 @@ struct LoweringPreparePass : public LoweringPrepareBase<LoweringPreparePass> {
   void runOnOperation() override;
 
   void runOnOp(Operation *op);
+  void lowerArrayCtor(ArrayCtor op);
 };
 
 } // namespace
 
-void LoweringPreparePass::runOnOp(Operation *op) {}
+void LoweringPreparePass::runOnOp(Operation *op) {
+  if (auto arrayCtor = dyn_cast<ArrayCtor>(op)) {
+    lowerArrayCtor(arrayCtor);
+  }
+}
+
+static void lowerArrayDtorCtorIntoLoop(CIRBaseBuilderTy &builder,
+                                       mlir::Operation *op, mlir::Type eltTy,
+                                       mlir::Value arrayAddr,
+                                       uint64_t arrayLen) {
+  // Generate loop to call into ctor/dtor for every element.
+  Location loc = op->getLoc();
+
+  // TODO: instead of fixed integer size, create alias for PtrDiffTy and unify
+  // with CIRGen stuff.
+  auto ptrDiffTy =
+      cir::IntType::get(builder.getContext(), 64, /*isSigned=*/false);
+  auto numArrayElementsConst = builder.create<cir::ConstantOp>(
+      loc, ptrDiffTy, cir::IntAttr::get(ptrDiffTy, arrayLen));
+
+  auto begin = builder.create<cir::CastOp>(
+      loc, eltTy, cir::CastKind::array_to_ptrdecay, arrayAddr);
+  mlir::Value end = builder.create<cir::PtrStrideOp>(loc, eltTy, begin,
+                                                     numArrayElementsConst);
+
+  mlir::Value tmpAddr = builder.createAlloca(
+      loc, /*addr type*/ builder.getPointerTo(eltTy),
+      /*var type*/ eltTy, "__array_idx", builder.getAlignmentAttr(1));
+  builder.createStore(loc, begin, tmpAddr);
+
+  cir::DoWhileOp loop = builder.createDoWhile(
+      loc,
+      /*condBuilder=*/
+      [&](mlir::OpBuilder &b, mlir::Location loc) {
+        auto currentElement = b.create<cir::LoadOp>(loc, eltTy, tmpAddr);
+        mlir::Type boolTy = cir::BoolType::get(b.getContext());
+        auto cmp = builder.create<cir::CmpOp>(loc, boolTy, cir::CmpOpKind::eq,
+                                              currentElement, end);
+        builder.createCondition(cmp);
+      },
+      /*bodyBuilder=*/
+      [&](mlir::OpBuilder &b, mlir::Location loc) {
+        auto currentElement = b.create<cir::LoadOp>(loc, eltTy, tmpAddr);
+
+        CallOp ctorCall;
+        op->walk([&](CallOp c) { ctorCall = c; });
+        assert(ctorCall && "expected ctor call");
+
+        auto one = builder.create<cir::ConstantOp>(
+            loc, ptrDiffTy, cir::IntAttr::get(ptrDiffTy, 1));
+
+        ctorCall->moveAfter(one);
+        ctorCall->setOperand(0, currentElement);
+
+        // Advance the pointer and store it in the temporary variable
+        auto nextElement =
+            builder.create<cir::PtrStrideOp>(loc, eltTy, currentElement, one);
+        builder.createStore(loc, nextElement, tmpAddr);
+        builder.createYield(loc);
+      });
+
+  op->replaceAllUsesWith(loop);
+  op->erase();
+}
+
+void LoweringPreparePass::lowerArrayCtor(ArrayCtor op) {
+  CIRBaseBuilderTy builder(getContext());
+  builder.setInsertionPointAfter(op.getOperation());
+
+  Type eltTy = op->getRegion(0).getArgument(0).getType();
+  auto arrayLen =
+      mlir::cast<cir::ArrayType>(op.getAddr().getType().getPointee()).getSize();
+  lowerArrayDtorCtorIntoLoop(builder, op, eltTy, op.getAddr(), arrayLen);
+}
 
 void LoweringPreparePass::runOnOperation() {
+  Operation *op = getOperation();
+
   llvm::SmallVector<Operation *> opsToTransform;
 
+  op->walk([&](Operation *op) {
+    if (isa<ArrayCtor>(op)) {
+      opsToTransform.push_back(op);
+    }
+  });
+
   for (auto *o : opsToTransform)
     runOnOp(o);
 }
diff --git a/clang/test/CIR/CodeGen/array-ctor.cpp b/clang/test/CIR/CodeGen/array-ctor.cpp
new file mode 100644
index 0000000000000..0a2661248f908
--- /dev/null
+++ b/clang/test/CIR/CodeGen/array-ctor.cpp
@@ -0,0 +1,70 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-cir %s -o %t.cir
+// RUN: FileCheck --input-file=%t.cir %s -check-prefix=CIR
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-llvm %s -o %t-cir.ll
+// RUN: FileCheck --input-f...
[truncated]

// Just skip out if the constant count is zero.
if (constIntAttr && constIntAttr.getUInt() == 0)
return;
// Otherwise, emit the check.
Contributor

Can you move this down into the else block?

if (constantCount) {
auto constIntAttr = mlir::dyn_cast<cir::IntAttr>(constantCount.getValue());
// Just skip out if the constant count is zero.
if (constIntAttr && constIntAttr.getUInt() == 0)
Contributor

Is it possible to add a test for the zero case?
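
As background for the zero case, the following standalone C++ sketch (not part of the patch) shows the language-level expectation: an array with zero elements triggers no constructor calls. It uses `new T[n]` with a run-time `n` because a static zero-length array relies on a GNU extension. The patch itself reports dynamic-length arrays as not yet implemented, so this illustrates only the semantics, not something CIR handles here. `Tracked`, `gCtorCalls`, and `ctorCallsForNew` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical global counter: incremented by every constructor call.
static std::size_t gCtorCalls = 0;

struct Tracked {
  Tracked() { ++gCtorCalls; }
};

// Allocates an array whose length is only known at run time and returns
// the number of constructor calls that allocation triggered.
std::size_t ctorCallsForNew(std::size_t n) {
  std::size_t before = gCtorCalls;
  Tracked *p = new Tracked[n]; // n == 0 is legal and constructs nothing
  delete[] p;
  return gCtorCalls - before;
}
```

With `n == 0` the function should report no constructor calls, which corresponds to the early return for a zero constant count in the code under review.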

cgm.errorNYI(e->getSourceRange(), "dynamic-length array expression");
}

auto arrayTy = mlir::dyn_cast<cir::ArrayType>(arrayBase.getElementType());
Contributor

It's a bit confusing that we have two successive calls to functions named getElementType(). The first one (Address::getElementType()) actually returns the pointee type of the address, while the second (ArrayType::getElementType()) returns the type of the elements in the array. It might be helpful to have a comment explaining this.

Contributor

Might be worth renaming Address::getElementType() to Address::getPointeeType() then?
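
The two-step type peeling discussed here can be illustrated with plain C++ type traits. This is only an analogy, not the CIR API: `std::remove_pointer_t` stands in for Address::getElementType() (which yields the pointee), and `std::remove_extent_t` stands in for ArrayType::getElementType() (which yields the array's element). The aliases and helper functions below are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct S {};

// Pointer to the whole array, analogous to the address handed to the op.
using PtrToArray = S (*)[42];

// Step 1: strip the pointer. Analogous to Address::getElementType(),
// which returns the pointee type (here: the array type).
using Pointee = std::remove_pointer_t<PtrToArray>;

// Step 2: strip one array extent. Analogous to ArrayType::getElementType(),
// which returns the type of the array's elements.
using Element = std::remove_extent_t<Pointee>;

static_assert(std::is_same<Pointee, S[42]>::value, "pointee is the array");
static_assert(std::is_same<Element, S>::value, "element is S");

constexpr bool pointeeIsArray() { return std::is_array<Pointee>::value; }
constexpr bool elementIsArray() { return std::is_array<Element>::value; }
```

The same name covering both steps is what the reviewers find confusing; in the analogy the two steps have visibly different names.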

S s[42];
}

// CIR: cir.func dso_local @_Z3foov()
Contributor

Can you add a set of checks to verify the CIR before LoweringPrepare is run?

Contributor Author

At the moment LoweringPrepare is run unconditionally. I could add an option but is there an easier way to do this?

Other than that I have added a test in clang/test/CIR/IR/array-ctor.cir that ensures CIR containing cir.array.ctor remains identical.

Contributor

You can use -mmlir --mlir-print-ir-before=cir-lowering-prepare to dump the CIR to stderr before the LoweringPrepare pass. You can find an example in clang/test/CIR/CodeGen/complex-unary.cpp.
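
A hedged sketch of how that suggestion might look in a lit test file: the first RUN line reuses the flags already present in array-ctor.cpp, adds the `-mmlir --mlir-print-ir-before=cir-lowering-prepare` option, and redirects stderr to a file that FileCheck then reads. The `CIR-BEFORE-LPP` prefix, the `%t-before-lp.cir` file name, and the `gCtorCalls` counter are assumptions made for illustration. The counter is there only so the snippet can also be exercised as ordinary C++.

```cpp
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -Wno-unused-value -fclangir -emit-cir -mmlir --mlir-print-ir-before=cir-lowering-prepare %s -o %t.cir 2> %t-before-lp.cir
// RUN: FileCheck --input-file=%t-before-lp.cir %s -check-prefix=CIR-BEFORE-LPP

#include <cassert>

// Illustrative counter (an assumption, not part of the real test).
int gCtorCalls = 0;

struct S {
  S() { ++gCtorCalls; }
};

void foo() {
  S s[42];
}

// Before LoweringPrepare runs, the array constructor op should still be
// present in the dumped IR.
// CIR-BEFORE-LPP: cir.array.ctor
```

The stderr redirect matters because the pass-manager dump goes to stderr, while `-o %t.cir` still receives the IR after LoweringPrepare.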


Type eltTy = op->getRegion(0).getArgument(0).getType();
auto arrayLen =
mlir::cast<cir::ArrayType>(op.getAddr().getType().getPointee()).getSize();
Contributor

Can you add a missing feature here for variable-length arrays?

Contributor Author

@bcardosolopes Is the plan to extend ArrayC/DtorOp to VLAs? My impression is that cir.array is intentionally made to be constant sized only.

Member

Good point Morris. I think we can extend them to have an Optional dynamic size input similar to CIR_AllocaOp. This would require relaxing the $addr constraints in tablegen and making sure a handwritten one checks for the appropriate type when the optional is not present.

Alternatively we could create a new op (cir.array.ctor.vla?), but extending existing ones seems simple and reasonable to me.
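
To make the "optional dynamic size" idea concrete, here is a small standalone C++ model. It is purely hypothetical and does not reflect any existing CIR operation definition: the op either carries a static length taken from a constant-sized array type, or an optional run-time length, and a hand-written check rejects the case where neither is available.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model of an array-constructor op with an optional dynamic size.
struct ArrayCtorModel {
  // Present when the address points to a constant-sized array.
  std::optional<std::size_t> staticLen;
  // Present when a run-time element count operand is supplied (VLA case).
  std::optional<std::size_t> dynamicLen;
};

// Stand-in for a hand-written verifier: without a dynamic size operand,
// the address must point to a constant-sized array.
bool verify(const ArrayCtorModel &op) {
  return op.dynamicLen.has_value() || op.staticLen.has_value();
}

// Number of elements to construct; the dynamic operand wins when present.
// Precondition: verify(op) is true.
std::size_t elementCount(const ArrayCtorModel &op) {
  return op.dynamicLen ? *op.dynamicLen : *op.staticLen;
}
```

The trade-off mentioned above shows up directly: one op covers both cases, but the type constraint on the address can no longer be expressed declaratively and moves into the verifier.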

mlir::Value end = builder.create<cir::PtrStrideOp>(loc, eltTy, begin,
numArrayElementsConst);

mlir::Value tmpAddr = builder.createAlloca(
Contributor

We should be doing something here to find the group of alloca ops in the entry block and insert this after the last of them.

Alternatively, can we eliminate the alloca entirely? Classic codegen doesn't create one.

Contributor

As we discussed, I think this is OK as it is.
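
For orientation, the shape of the loop that the lowering builds can be approximated in plain C++: compute a one-past-the-end pointer, keep a cursor in a temporary (the role played by the `__array_idx` alloca), and run a do-while that constructs the current element and advances the cursor. This is a sketch, not the pass itself. `Elem`, `gBuilt`, `constructInPlace`, and `buildFour` are hypothetical names, and the loop condition is the sketch's own choice. It assumes a nonzero length, mirroring the early return for a zero constant count in emitCXXAggrConstructorCall.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical counter and element type for the demonstration.
static std::size_t gBuilt = 0;

struct Elem {
  int tag;
  Elem() : tag(7) { ++gBuilt; }
};

// Constructs `len` elements in raw storage starting at `begin`.
// Precondition: len > 0, because the do-while body runs before the
// condition is evaluated.
void constructInPlace(Elem *begin, std::size_t len) {
  Elem *end = begin + len; // one-past-the-end pointer
  Elem *cursor = begin;    // plays the role of the "__array_idx" temporary
  do {
    ::new (static_cast<void *>(cursor)) Elem(); // constructor call on the current element
    cursor = cursor + 1;                        // advance by one element and store back
  } while (cursor != end);
}

// Runs the loop over suitably aligned raw storage for four elements and
// returns how many constructor calls happened.
std::size_t buildFour() {
  alignas(Elem) unsigned char storage[4 * sizeof(Elem)];
  std::size_t before = gBuilt;
  constructInPlace(reinterpret_cast<Elem *>(storage), 4);
  // Elem is trivially destructible, so no destructor calls are required.
  return gBuilt - before;
}
```

Keeping the cursor in a named temporary is what introduces the extra alloca discussed in this thread; a version that carries the pointer as a loop value would not need it.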

let description = [{
Initialize each array element using the same C++ constructor. This
operation has one region, with one single block. The block has an
incoming argument for the current array index to initialize.
Contributor

Can you add an example here, please?



let regions = (region SizedRegion<1>:$body);
let assemblyFormat = [{
`(` $addr `:` qualified(type($addr)) `)` $body attr-dict
Member

While here, can we get rid of parens? https://llvm.github.io/clangir/Dialect/cir-style-guide.html

```mlir
cir.array.ctor(%0 : !cir.ptr<!cir.array<!rec_S x 42>>) {
^bb0(%arg0: !cir.ptr<!rec_S>):
cir.call @_ZN1SC1Ev(%arg0) : (!cir.ptr<!rec_S>) -> ()
```
Member

_ZN1SC1Ev -> some_ctor would be clearer for the uninitiated

baseType = eltType;
cir::ConstantOp numElements = builder.getConstInt(*currSrcLoc, SizeTy, countFromCLAs);

return numElements;
Member

Maybe just return builder.getConstInt directly?



@andykaylor andykaylor left a comment

lgtm, with one nit remaining and a couple of test requests

void foo() {
S s[42];
}

Contributor

Can you add a multi-dimensional array test?

Contributor Author

I tried. Good catch, it doesn't work :) The array to pointer decay is incorrect in the incubator CodeGen (and probably in LoweringPrepare, too). I'm looking into getting this fixed in this PR and the incubator.
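
For reference, the expected behavior for a multi-dimensional array can be checked in standalone C++: the number of constructor calls equals the product of all extents, which is the quantity emitArrayLength accumulates in `countFromCLAs` by walking nested constant array types. The `FlatCount` trait, the `gCtorRuns` counter, and `ctorRunsFor3x4` below are hypothetical helpers written for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Product of all extents of a (possibly nested) constant-sized array type.
// A non-array type counts as a single element.
template <typename T>
struct FlatCount : std::integral_constant<std::size_t, 1> {};

template <typename T, std::size_t N>
struct FlatCount<T[N]>
    : std::integral_constant<std::size_t, N * FlatCount<T>::value> {};

// Hypothetical counter and element type.
static std::size_t gCtorRuns = 0;

struct S {
  S() { ++gCtorRuns; }
};

static_assert(FlatCount<S[42]>::value == 42, "single dimension");
static_assert(FlatCount<S[3][4]>::value == 12, "extents multiply");

// Declares a 3x4 array and returns how many constructor calls it triggered.
std::size_t ctorRunsFor3x4() {
  std::size_t before = gCtorRuns;
  S grid[3][4];
  (void)grid;
  return gCtorRuns - before;
}
```

A codegen test for `S s[3][4]` would therefore expect the constructor to run for all twelve elements, with the element pointer typed as a pointer to `S` rather than to the inner array.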

auto arrayOp = builder.createPtrBitcast(arrayBase.getPointer(), arrayTy);
builder.create<cir::ArrayCtor>(
*currSrcLoc, arrayOp, [&](mlir::OpBuilder &b, mlir::Location loc) {
auto arg = b.getInsertionBlock()->addArgument(ptrToElmType, loc);
Contributor

Bumping this nit.


@andykaylor andykaylor left a comment

lgtm

mmha and others added 8 commits July 24, 2025 16:38
This patch upstreams support for creating arrays of classes that require calling a constructor.

* Adds the ArrayCtor operation
* New lowering pass for lowering ArrayCtor to a loop
Co-authored-by: Andy Kaylor
Co-authored-by: Henrich Lauko
* Remove use of auto
@mmha mmha merged commit 3e9d369 into llvm:main Jul 24, 2025
6 of 8 checks passed