[NFC][mlir][mesh,shard] Fixing misnomers in mesh dialect, renaming 'mesh' dialect to 'shard' #150177
@llvm/pr-subscribers-mlir-func @llvm/pr-subscribers-mlir-core
Author: Frank Schlimbach (fschlimb)
Changes
[mlir][mesh,shard] Fixing misnomers in mesh dialect (discourse 87053)
A lot of diffs, but simple renames only.
@tkarna @yaochengji
Patch is 633.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150177.diff
100 Files Affected:
diff --git a/mlir/docs/Dialects/Mesh.md b/mlir/docs/Dialects/Shard.md
similarity index 73%
rename from mlir/docs/Dialects/Mesh.md
rename to mlir/docs/Dialects/Shard.md
index 5eb6569c7044b..714b340db4cde 100644
--- a/mlir/docs/Dialects/Mesh.md
+++ b/mlir/docs/Dialects/Shard.md
@@ -1,28 +1,28 @@
-# 'mesh' Dialect
+# 'shard' Dialect
-The `mesh` dialect contains a set of attributes, operations and interfaces that
-are useful for representing sharding and communication on a device mesh
+The `shard` dialect contains a set of attributes, operations and interfaces that
+are useful for representing sharding and communication on a device grid
cluster.
[TOC]
## Collective Communication Operations
-There are a number of operations in the Mesh dialect to facilitate
-communication between devices in a mesh.
+There are a number of operations in the Shard dialect to facilitate
+communication between devices in a grid.
It is assumed that the user is familiar with collective operations.
[Wikipedia](https://en.wikipedia.org/wiki/Collective_operation) has a good
explanation.
-The main addition is that the collectives in this dialect have mesh
+The main addition is that the collectives in this dialect have grid
semantics.
### Device groups
-The operation attributes `mesh` and `mesh_axes` specifies a list of device mesh
+The operation attributes `grid` and `grid_axes` specifies a list of device grid
axes that partition the devices into disjoint groups.
The collective operation is performed between devices in the same group.
-Devices that have the same coordinates outside of axes `mesh_axes` are in the
+Devices that have the same coordinates outside of axes `grid_axes` are in the
same group.
-A group is described by its multi-index along the axes outside of `mesh_axes`.
-For example if we have a device mesh of size `2x3x4x5` and the partition mesh
+A group is described by its multi-index along the axes outside of `grid_axes`.
+For example if we have a device grid of size `2x3x4x5` and the partition grid
axes list is `[0, 1]` then devices are partitioned into the groups
`{ { (i, j, k, m) | 0<=i<2, 0<=j<3 } | 0<=k<4, 0<=m<5 }`.
The device groups would be `{ (k, m) | 0<=k<4, 0<=m<5 }`.
@@ -31,7 +31,7 @@ Device (1, 0, 2, 4) will be in another group.
Some collective operations like all-to-all and all-gather care about the
order of devices.
The order of device in a device group is induced by the order of axes in
-`mesh_axes`.
+`grid_axes`.
The axes are ordered from outer to inner.
If we have an axis list `[3, 1]` then device `(i, 1, k, 0)` will precede
both devices `(i, 0, k, 1)` and `(i, 2, k, 0)`.
@@ -39,11 +39,11 @@ both devices `(i, 0, k, 1)` and `(i, 2, k, 0)`.
### In-group Device
Some operations like `broadcast`, `scatter` and `send` specify devices in each
device-group.
-These devices are represented with their multi-index over the mesh axes that
+These devices are represented with their multi-index over the grid axes that
are not constant within a device group.
-These are the axes specified by `mesh_axes` attribute.
+These are the axes specified by `grid_axes` attribute.
-For Example on a 3D mesh an operation with `mesh_axes = [0, 2]` would specify
+For Example on a 3D grid an operation with `grid_axes = [0, 2]` would specify
an in-group device with `(i, j)`. Then for each group with index `g` on the
second axis, the in-group device would be `(i, g, j)`.
### Purity
@@ -60,15 +60,15 @@ For example if a collective operation is optimized out, than it must also
not appear in any path of execution on any process.
Having the operations as `Pure` implies that if an interpreter is to execute
-the IR containing the `mesh` collectives, all processes would execute the same
+the IR containing the `grid` collectives, all processes would execute the same
line when they reach a pure collective operation.
This requirement stems from the need to be compatible with general optimization
passes like dead code and common sub-expression elimination.
## Operations
-[include "Dialects/MeshOps.md"]
+[include "Dialects/ShardOps.md"]
## Attributes
-[include "Dialects/MeshAttrs.md"]
+[include "Dialects/ShardAttrs.md"]
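The device-group rule described in the Shard.md hunks above can be illustrated with a small standalone sketch. It is not MLIR code: `groupIndex`, `inSameGroup` and `fullCoordinates` are hypothetical helpers introduced only for illustration, and the sketch assumes that an in-group device lists its coordinates in the same order as `grid_axes`, which matches the `grid_axes = [0, 2]` example in the documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Coords = std::vector<int64_t>;
using Axes = std::vector<int16_t>;

// True if `axis` is one of the axes the collective runs over.
static bool isGridAxis(size_t axis, const Axes &gridAxes) {
  return std::find(gridAxes.begin(), gridAxes.end(),
                   static_cast<int16_t>(axis)) != gridAxes.end();
}

// Hypothetical helper (not an MLIR API): the multi-index of the group that
// `device` belongs to, i.e. its coordinates along every axis NOT listed in
// `gridAxes`, in ascending axis order.
Coords groupIndex(const Coords &device, const Axes &gridAxes) {
  Coords group;
  for (size_t axis = 0; axis < device.size(); ++axis)
    if (!isGridAxis(axis, gridAxes))
      group.push_back(device[axis]);
  return group;
}

// Two devices take part in the same collective iff their group indices match.
bool inSameGroup(const Coords &a, const Coords &b, const Axes &gridAxes) {
  assert(a.size() == b.size() && "devices must live on the same grid");
  return groupIndex(a, gridAxes) == groupIndex(b, gridAxes);
}

// Inverse view: rebuild full grid coordinates from a group index and an
// in-group device. Assumption: `inGroup[k]` is the coordinate along
// `gridAxes[k]`.
Coords fullCoordinates(size_t gridRank, const Coords &group,
                       const Coords &inGroup, const Axes &gridAxes) {
  assert(inGroup.size() == gridAxes.size());
  assert(group.size() + gridAxes.size() == gridRank);
  Coords coords(gridRank, 0);
  for (size_t k = 0; k < gridAxes.size(); ++k)
    coords[static_cast<size_t>(gridAxes[k])] = inGroup[k];
  size_t next = 0;
  for (size_t axis = 0; axis < gridRank; ++axis)
    if (!isGridAxis(axis, gridAxes))
      coords[axis] = group[next++];
  return coords;
}
```

With `grid_axes = [0, 1]` on a `2x3x4x5` grid, devices `(1, 0, 2, 3)` and `(1, 1, 2, 3)` share the group index `(2, 3)`, while `(1, 0, 2, 4)` falls into group `(2, 4)`. On a 3D grid with `grid_axes = [0, 2]`, the in-group device `(i, j)` in group `g` maps back to `(i, g, j)`.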
diff --git a/mlir/docs/Passes.md b/mlir/docs/Passes.md
index e9d22d1e3dfac..9df32666415bb 100644
--- a/mlir/docs/Passes.md
+++ b/mlir/docs/Passes.md
@@ -72,9 +72,9 @@ This document describes the available MLIR passes and their contracts.
[include "MemRefPasses.md"]
-## 'mesh' Dialect Passes
+## 'shard' Dialect Passes
-[include "MeshPasses.md"]
+[include "ShardPasses.md"]
## 'ml\_program' Dialect Passes
diff --git a/mlir/include/mlir/Conversion/Passes.h b/mlir/include/mlir/Conversion/Passes.h
index d93fbefab74aa..3dc48b2201cf2 100644
--- a/mlir/include/mlir/Conversion/Passes.h
+++ b/mlir/include/mlir/Conversion/Passes.h
@@ -52,7 +52,6 @@
#include "mlir/Conversion/MemRefToEmitC/MemRefToEmitCPass.h"
#include "mlir/Conversion/MemRefToLLVM/MemRefToLLVM.h"
#include "mlir/Conversion/MemRefToSPIRV/MemRefToSPIRVPass.h"
-#include "mlir/Conversion/MeshToMPI/MeshToMPI.h"
#include "mlir/Conversion/NVGPUToNVVM/NVGPUToNVVM.h"
#include "mlir/Conversion/NVVMToLLVM/NVVMToLLVM.h"
#include "mlir/Conversion/OpenACCToSCF/ConvertOpenACCToSCF.h"
@@ -66,6 +65,7 @@
#include "mlir/Conversion/SCFToSPIRV/SCFToSPIRVPass.h"
#include "mlir/Conversion/SPIRVToLLVM/SPIRVToLLVMPass.h"
#include "mlir/Conversion/ShapeToStandard/ShapeToStandard.h"
+#include "mlir/Conversion/ShardToMPI/ShardToMPI.h"
#include "mlir/Conversion/TensorToLinalg/TensorToLinalgPass.h"
#include "mlir/Conversion/TensorToSPIRV/TensorToSPIRVPass.h"
#include "mlir/Conversion/TosaToArith/TosaToArith.h"
diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td
index 8183f355795a9..eb18160ea2eeb 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -903,13 +903,13 @@ def ConvertMemRefToSPIRVPass : Pass<"convert-memref-to-spirv"> {
}
//===----------------------------------------------------------------------===//
-// MeshToMPI
+// ShardToMPI
//===----------------------------------------------------------------------===//
-def ConvertMeshToMPIPass : Pass<"convert-mesh-to-mpi"> {
- let summary = "Convert Mesh dialect to MPI dialect.";
+def ConvertShardToMPIPass : Pass<"convert-shard-to-mpi"> {
+ let summary = "Convert Shard dialect to MPI dialect.";
let description = [{
- This pass converts communication operations from the Mesh dialect to the
+ This pass converts communication operations from the Shard dialect to the
MPI dialect.
If it finds the DLTI attribute "MPI:comm_world-rank" on the module it will
use that integer value instead of calling MPI_Comm_rank. This allows
diff --git a/mlir/include/mlir/Conversion/MeshToMPI/MeshToMPI.h b/mlir/include/mlir/Conversion/ShardToMPI/ShardToMPI.h
similarity index 64%
rename from mlir/include/mlir/Conversion/MeshToMPI/MeshToMPI.h
rename to mlir/include/mlir/Conversion/ShardToMPI/ShardToMPI.h
index bc64e7a3c1c8c..b1aa08c432249 100644
--- a/mlir/include/mlir/Conversion/MeshToMPI/MeshToMPI.h
+++ b/mlir/include/mlir/Conversion/ShardToMPI/ShardToMPI.h
@@ -1,4 +1,4 @@
-//===- MeshToMPI.h - Convert Mesh to MPI dialect ----------------*- C++ -*-===//
+//===- ShardToMPI.h - Convert Shard to MPI dialect --------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -6,8 +6,8 @@
//
//===----------------------------------------------------------------------===//
-#ifndef MLIR_CONVERSION_MESHTOMPI_MESHTOMPI_H
-#define MLIR_CONVERSION_MESHTOMPI_MESHTOMPI_H
+#ifndef MLIR_CONVERSION_SHARDTOMPI_SHARDTOMPI_H
+#define MLIR_CONVERSION_SHARDTOMPI_SHARDTOMPI_H
#include "mlir/Pass/Pass.h"
#include "mlir/Support/LLVM.h"
@@ -15,9 +15,9 @@
namespace mlir {
class Pass;
-#define GEN_PASS_DECL_CONVERTMESHTOMPIPASS
+#define GEN_PASS_DECL_CONVERTSHARDTOMPIPASS
#include "mlir/Conversion/Passes.h.inc"
} // namespace mlir
-#endif // MLIR_CONVERSION_MESHTOMPI_MESHTOMPI_H
+#endif // MLIR_CONVERSION_SHARDTOMPI_SHARDTOMPI_H
diff --git a/mlir/include/mlir/Dialect/CMakeLists.txt b/mlir/include/mlir/Dialect/CMakeLists.txt
index 56dc97282fa4a..e27b1679c2a52 100644
--- a/mlir/include/mlir/Dialect/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/CMakeLists.txt
@@ -19,7 +19,7 @@ add_subdirectory(Linalg)
add_subdirectory(LLVMIR)
add_subdirectory(Math)
add_subdirectory(MemRef)
-add_subdirectory(Mesh)
+add_subdirectory(Shard)
add_subdirectory(MLProgram)
add_subdirectory(MPI)
add_subdirectory(NVGPU)
diff --git a/mlir/include/mlir/Dialect/Func/Extensions/MeshShardingExtensions.h b/mlir/include/mlir/Dialect/Func/Extensions/ShardingExtensions.h
similarity index 88%
rename from mlir/include/mlir/Dialect/Func/Extensions/MeshShardingExtensions.h
rename to mlir/include/mlir/Dialect/Func/Extensions/ShardingExtensions.h
index 30d3033209d21..e22b24b3446bb 100644
--- a/mlir/include/mlir/Dialect/Func/Extensions/MeshShardingExtensions.h
+++ b/mlir/include/mlir/Dialect/Func/Extensions/ShardingExtensions.h
@@ -1,4 +1,4 @@
-//===- MeshShardingExtensions.h - -----------------------------------------===//
+//===- ShardingExtensions.h - -----------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
diff --git a/mlir/include/mlir/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.h b/mlir/include/mlir/Dialect/Linalg/Transforms/ShardingInterfaceImpl.h
similarity index 54%
rename from mlir/include/mlir/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.h
rename to mlir/include/mlir/Dialect/Linalg/Transforms/ShardingInterfaceImpl.h
index c57501ea86b7e..dc21bc05a2dc1 100644
--- a/mlir/include/mlir/Dialect/Linalg/Transforms/MeshShardingInterfaceImpl.h
+++ b/mlir/include/mlir/Dialect/Linalg/Transforms/ShardingInterfaceImpl.h
@@ -1,4 +1,4 @@
-//===- MeshShardingInterfaceImpl.h ----------------------------------------===//
+//===- ShardingInterfaceImpl.h ----------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -6,15 +6,15 @@
//
//===----------------------------------------------------------------------===//
-#ifndef MLIR_DIALECT_LINALG_MESHSHARDINGINTERFACEIMPL_H
-#define MLIR_DIALECT_LINALG_MESHSHARDINGINTERFACEIMPL_H
+#ifndef MLIR_DIALECT_LINALG_SHARDSHARDINGINTERFACEIMPL_H
+#define MLIR_DIALECT_LINALG_SHARDSHARDINGINTERFACEIMPL_H
namespace mlir {
class DialectRegistry;
namespace linalg {
-void registerMeshShardingInterfaceExternalModels(DialectRegistry &registry);
+void registerShardingInterfaceExternalModels(DialectRegistry &registry);
} // namespace linalg
} // namespace mlir
-#endif // MLIR_DIALECT_LINALG_MESHSHARDINGINTERFACEIMPL_H
+#endif // MLIR_DIALECT_LINALG_SHARDSHARDINGINTERFACEIMPL_H
diff --git a/mlir/include/mlir/Dialect/Mesh/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/Mesh/IR/CMakeLists.txt
deleted file mode 100644
index f26c6285efd89..0000000000000
--- a/mlir/include/mlir/Dialect/Mesh/IR/CMakeLists.txt
+++ /dev/null
@@ -1,25 +0,0 @@
-add_mlir_doc(MeshOps MeshOps Dialects/ -gen-op-doc -dialect=mesh)
-add_mlir_doc(MeshOps MeshAttrs Dialects/ -gen-attrdef-doc -dialect=mesh)
-
-set(LLVM_TARGET_DEFINITIONS MeshOps.td)
-mlir_tablegen(MeshDialect.cpp.inc -gen-dialect-defs -dialect=mesh)
-mlir_tablegen(MeshDialect.h.inc -gen-dialect-decls -dialect=mesh)
-
-set(LLVM_TARGET_DEFINITIONS MeshBase.td)
-mlir_tablegen(MeshAttributes.h.inc -gen-attrdef-decls)
-mlir_tablegen(MeshAttributes.cpp.inc -gen-attrdef-defs)
-
-set(LLVM_TARGET_DEFINITIONS MeshBase.td)
-mlir_tablegen(MeshEnums.h.inc -gen-enum-decls)
-mlir_tablegen(MeshEnums.cpp.inc -gen-enum-defs)
-
-set(LLVM_TARGET_DEFINITIONS MeshBase.td)
-mlir_tablegen(MeshTypes.h.inc -gen-typedef-decls)
-mlir_tablegen(MeshTypes.cpp.inc -gen-typedef-defs)
-
-set(LLVM_TARGET_DEFINITIONS MeshOps.td)
-mlir_tablegen(MeshOps.h.inc -gen-op-decls)
-mlir_tablegen(MeshOps.cpp.inc -gen-op-defs)
-
-add_public_tablegen_target(MLIRMeshIncGen)
-add_dependencies(mlir-headers MLIRMeshIncGen)
diff --git a/mlir/include/mlir/Dialect/Mesh/Transforms/CMakeLists.txt b/mlir/include/mlir/Dialect/Mesh/Transforms/CMakeLists.txt
deleted file mode 100644
index 8d768485103b6..0000000000000
--- a/mlir/include/mlir/Dialect/Mesh/Transforms/CMakeLists.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-set(LLVM_TARGET_DEFINITIONS Passes.td)
-mlir_tablegen(Passes.h.inc -gen-pass-decls -name Mesh)
-add_public_tablegen_target(MLIRMeshPassIncGen)
-add_dependencies(mlir-headers MLIRMeshPassIncGen)
-
-add_mlir_doc(Passes MeshPasses ./ -gen-pass-doc)
diff --git a/mlir/include/mlir/Dialect/Mesh/CMakeLists.txt b/mlir/include/mlir/Dialect/Shard/CMakeLists.txt
similarity index 100%
rename from mlir/include/mlir/Dialect/Mesh/CMakeLists.txt
rename to mlir/include/mlir/Dialect/Shard/CMakeLists.txt
diff --git a/mlir/include/mlir/Dialect/Shard/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/Shard/IR/CMakeLists.txt
new file mode 100644
index 0000000000000..a2495af135899
--- /dev/null
+++ b/mlir/include/mlir/Dialect/Shard/IR/CMakeLists.txt
@@ -0,0 +1,25 @@
+add_mlir_doc(ShardOps ShardOps Dialects/ -gen-op-doc -dialect=shard)
+add_mlir_doc(ShardOps ShardAttrs Dialects/ -gen-attrdef-doc -dialect=shard)
+
+set(LLVM_TARGET_DEFINITIONS ShardOps.td)
+mlir_tablegen(ShardDialect.cpp.inc -gen-dialect-defs -dialect=shard)
+mlir_tablegen(ShardDialect.h.inc -gen-dialect-decls -dialect=shard)
+
+set(LLVM_TARGET_DEFINITIONS ShardBase.td)
+mlir_tablegen(ShardAttributes.h.inc -gen-attrdef-decls)
+mlir_tablegen(ShardAttributes.cpp.inc -gen-attrdef-defs)
+
+set(LLVM_TARGET_DEFINITIONS ShardBase.td)
+mlir_tablegen(ShardEnums.h.inc -gen-enum-decls)
+mlir_tablegen(ShardEnums.cpp.inc -gen-enum-defs)
+
+set(LLVM_TARGET_DEFINITIONS ShardBase.td)
+mlir_tablegen(ShardTypes.h.inc -gen-typedef-decls)
+mlir_tablegen(ShardTypes.cpp.inc -gen-typedef-defs)
+
+set(LLVM_TARGET_DEFINITIONS ShardOps.td)
+mlir_tablegen(ShardOps.h.inc -gen-op-decls)
+mlir_tablegen(ShardOps.cpp.inc -gen-op-defs)
+
+add_public_tablegen_target(MLIRShardIncGen)
+add_dependencies(mlir-headers MLIRShardIncGen)
diff --git a/mlir/include/mlir/Dialect/Mesh/IR/MeshBase.td b/mlir/include/mlir/Dialect/Shard/IR/ShardBase.td
similarity index 64%
rename from mlir/include/mlir/Dialect/Mesh/IR/MeshBase.td
rename to mlir/include/mlir/Dialect/Shard/IR/ShardBase.td
index 61403ac178980..41ae31807c825 100644
--- a/mlir/include/mlir/Dialect/Mesh/IR/MeshBase.td
+++ b/mlir/include/mlir/Dialect/Shard/IR/ShardBase.td
@@ -1,4 +1,4 @@
-//===- MeshBase.td - Mesh Dialect --------------------------*- tablegen -*-===//
+//===- ShardBase.td - Shard Dialect ------------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -6,8 +6,8 @@
//
//===----------------------------------------------------------------------===//
-#ifndef MLIR_DIALECT_MESH_IR_MESHBASE_TD
-#define MLIR_DIALECT_MESH_IR_MESHBASE_TD
+#ifndef MLIR_DIALECT_SHARD_IR_SHARDBASE_TD
+#define MLIR_DIALECT_SHARD_IR_SHARDBASE_TD
include "mlir/IR/OpBase.td"
include "mlir/IR/AttrTypeBase.td"
@@ -16,15 +16,15 @@ include "mlir/IR/CommonAttrConstraints.td"
include "mlir/IR/EnumAttr.td"
//===----------------------------------------------------------------------===//
-// Mesh Dialect
+// Shard Dialect
//===----------------------------------------------------------------------===//
-def Mesh_Dialect : Dialect {
- let name = "mesh";
- let cppNamespace = "::mlir::mesh";
+def Shard_Dialect : Dialect {
+ let name = "shard";
+ let cppNamespace = "::mlir::shard";
let description = [{
- See [Mesh dialect documentation](mlir/docs/Dialects/Mesh.md).
+ See [Shard dialect documentation](mlir/docs/Dialects/Shard.md).
}];
let dependentDialects = [
@@ -36,16 +36,16 @@ def Mesh_Dialect : Dialect {
let hasConstantMaterializer = 1;
}
-def Mesh_MeshAxis : I<16>;
-def Mesh_MeshAxesAttr : DenseArrayAttrBase<"DenseI16ArrayAttr", "int16_t", "i16">;
-def Mesh_ShardShapeAttr : DenseArrayAttrBase<"DenseI64ArrayAttr", "int64_t", "i64">;
+def Shard_GridAxis : I<16>;
+def Shard_GridAxesAttr : DenseArrayAttrBase<"DenseI16ArrayAttr", "int16_t", "i16">;
+def Shard_ShardShapeAttr : DenseArrayAttrBase<"DenseI64ArrayAttr", "int64_t", "i64">;
//===----------------------------------------------------------------------===//
-// Mesh Enums.
+// Shard Enums.
//===----------------------------------------------------------------------===//
-def Mesh_ReductionKind : I32EnumAttr<"ReductionKind",
- "Reduction of an iterator/mesh dimension.", [
+def Shard_ReductionKind : I32EnumAttr<"ReductionKind",
+ "Reduction of an iterator/grid dimension.", [
I32EnumAttrCase<"Sum", 1, "sum">,
I32EnumAttrCase<"Max", 2, "max">,
I32EnumAttrCase<"Min", 3, "min">,
@@ -58,31 +58,31 @@ def Mesh_ReductionKind : I32EnumAttr<"ReductionKind",
I32EnumAttrCase<"Generic", 100, "generic">
]> {
let genSpecializedAttr = 0;
- let cppNamespace = "::mlir::mesh";
+ let cppNamespace = "::mlir::shard";
}
-def Mesh_ReductionKindAttr : EnumAttr<Mesh_Dialect, Mesh_ReductionKind, "partial"> {
+def Shard_ReductionKindAttr : EnumAttr<Shard_Dialect, Shard_ReductionKind, "partial"> {
let assemblyFormat = "$value";
}
-class Mesh_Type<string name, string typeMnemonic, list<Trait> traits = [],
+class Shard_Type<string name, string typeMnemonic, list<Trait> traits = [],
string baseCppClass = "::mlir::Type">
- : TypeDef<Mesh_Dialect, name, traits, baseCppClass> {
+ : TypeDef<Shard_Dialect, name, traits, baseCppClass> {
let mnemonic = typeMnemonic;
}
-def Mesh_Sharding : Mesh_Type<"Sharding", "sharding"> {
+def Shard_Sharding : Shard_Type<"Sharding", "sharding"> {
let summary = "sharding definition";
let assemblyFormat = "";
}
//===----------------------------------------------------------------------===//
-// Mesh Attribute
+// Shard Attribute
//===----------------------------------------------------------------------===//
-def Mesh_MeshAxesArrayAttr : AttrDef<Mesh_Dialect, "MeshAxesArray"> {
+def Shard_GridAxesArrayAttr : AttrDef<Shard_Dialect, "GridAxesArray"> {
let mnemonic = "axisarray";
- let parameters = (ins ArrayRefParameter<"MeshAxesAttr">:$axes);
+ let parameters = (ins ArrayRefParameter<"GridAxesAttr">:$axes);
let assemblyFormat = "`[` $axes `]`";
let extraClassDeclaration = [{
size_t size() const { return getAxes().size(); }
@@ -91,4 +91,4 @@ def Mesh_MeshAxesArrayAttr : AttrDef<Mesh_Dialect, "MeshAxesArray"> {
}];
}
-#endif // MLIR_DIALECT_MESH_IR_MESHBASE_TD
+#endif // MLIR_DIALECT_SHARD_IR_SHARDBASE_TD
diff --git a/mlir/include/mlir/Dialect/Mesh/IR/MeshDialect.h b/mlir/include/mlir/Dialect/Shard/IR/ShardDialect.h
similarity index 57%
rename from mlir/include/mlir/Dialect/Mesh/IR/MeshDialect.h
rename to mlir/include/mlir/Dialect/Shard/IR/ShardDialect.h
index a30cf91e851fe..4113a668d4b76 100644
--- a/mlir/include/mlir/Dialect/Mesh/IR/MeshDialect.h
+++ b/mlir/include/mlir/Dialect/Shard/IR/ShardDialect.h
@@ -1,4 +1,4 @@
-//===- MeshOps.h - Mesh Dialect ---------------------------------*- C++ -*-===//
+//===- ShardOps.h - Shard Dialect -------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
@@ -6,11 +6,11 @@
//
//===----------------------------------------------------------------------===//
-#ifndef MLIR_DIALECT_MESH_IR_MESHDIALECT_H
-#define MLIR_DIALECT_MESH_IR_MESHDIALECT_H
+#ifndef MLIR_DIALECT_SHARD_IR_SHARDDIALECT_H
+#define MLIR_DIALECT_SHARD_IR_SHARDDIALECT_H
#include "mlir/IR/Dialect.h"
-#include "mlir/Dialect/Mesh/IR/MeshDialect.h.inc"
+#include "mlir/Dialect/Shard/IR/ShardDialect.h.inc"
-#endif // MLIR_DIALECT_MESH_IR_MESHDIALECT_H
+#endif // MLIR_DIALECT_SHARD_IR_SHARDDIALECT_H
diff --git a/mlir/include/mlir/Dialect/Mesh/IR/MeshOps.h b/mlir/include/mlir/Dialect/Shard/IR/ShardOps.h
similarity index 52%
rename from mlir/include/mlir/Dialect/Mesh/IR/MeshOps.h
rename to mlir/include/mlir/Dialect/Shard/IR/ShardOps.h
index 7cfe59dd957ca.....
[truncated]
Only spot checked a bit but in general makes sense to me. 👍
Just three remarks, all related to the dialect changing name:
- I would prefer if the commit's title/short message mentions changing the dialect's name.
- Given that the dialect is changing name, it's probably prudent to issue a PSA on the forums (just in case there are some users out there and to document that this is a thing that happened).
- It's probably nice to have a message near the top of the dialect's documentation page saying that this dialect used to be called "mesh". Just in case people see this "new" dialect show up and want to figure out how it got in.
EDIT: PR can probably be tagged NFC as well.
✅ With the latest revision this PR passed the C/C++ code formatter.
Looks good. No objections here or in Discourse, so should be fine.
But still good to wait a few more days, since it's a dialect renaming.
Thanks!
mlir/docs/Dialects/Shard.md
Outdated
are useful for representing sharding and communication on a device mesh
cluster.
This dialect contains a set of attributes, operations and interfaces that
are useful for representing sharding and communication on a device grid.
Nit: "sharding" of what?
Can we link to some definition or introduction to the concept?
mlir/docs/Dialects/Shard.md
Outdated
It is assumed that the user is familiar with collective operations.
[Wikipedia](https://en.wikipedia.org/wiki/Collective_operation) has a good
explanation.
The main addition is that the collectives in this dialect have mesh
The main addition is that the collectives in this dialect have grid
semantics.
Can we expand on what this means to "have grid semantics" as an addition to the collectives?
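One concrete piece of the "grid semantics" in question is the device order inside a group, which the documentation says is induced by the order of axes in `grid_axes`, from outer to inner. A minimal sketch of that rule follows, assuming a row-major linearization; `inGroupRank` is a hypothetical helper and not part of the dialect's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an MLIR API): linear position of `device` inside
// its group. `gridAxes` is read from outermost to innermost, so the last
// listed axis varies fastest.
int64_t inGroupRank(const std::vector<int64_t> &device,
                    const std::vector<int64_t> &gridShape,
                    const std::vector<int16_t> &gridAxes) {
  assert(device.size() == gridShape.size() && "rank mismatch");
  int64_t rank = 0;
  for (int16_t axis : gridAxes) {
    assert(axis >= 0 && static_cast<size_t>(axis) < gridShape.size());
    // Shift the rank accumulated so far by the extent of this axis, then add
    // the device's coordinate along it.
    rank = rank * gridShape[static_cast<size_t>(axis)] +
           device[static_cast<size_t>(axis)];
  }
  return rank;
}
```

For a `2x3x4x5` grid and the axis list `[3, 1]`, this places device `(i, 1, k, 0)` before both `(i, 0, k, 1)` and `(i, 2, k, 0)`, in line with the example in the documentation.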
We should take this opportunity to improve the documentation as much as possible, in particular with reference to existing concepts. In particular since this is claiming that the original was a "misnomer", we should be able to anchor this in existing work out there.
Thank you @joker-eph for your suggestion regarding improved documentation; that's a good idea.

The motivation for renaming the dialect is stated above: the dialect is not fundamentally about "meshes." In computer science, "mesh" typically refers to either mesh data structures or mesh networks. To my knowledge, the dialect was inspired by GSPMD, which also talks about "sharding annotations", so it may be appropriate to mention the relevant paper in the dialect documentation. I will add a reference accordingly.

As far as I'm aware, the term "spmdization" is only used in OpenMP, and even there it refers to a related but distinct concept. Furthermore, searching for "spmdize" yields no results outside of MLIR. The more widely adopted term for this concept is partitioning, as seen in GSPMD and Shardy. Therefore, citing GSPMD may be sufficient to get the suggested anchor.

The term mesh is indeed used in various contexts to describe device networks. For example, PyTorch uses it to refer to device groups. However, in those cases as well, the devices form a rectangular grid. The same reasoning we used to justify renaming operations and symbol names applies here. That said, I wouldn't object if there were broader consensus around retaining the term mesh to refer specifically to the device network within the shard dialect.
- dialect name mesh -> shard
- (device) mesh -> (device) grid
- spmdize -> partition
Force-pushed from 0c80f00 to 6f504c5
I updated Shard.md to address @joker-eph's concerns above. Further fixes to doc strings etc. can be done in a separate PR.
Thanks @fschlimb !
[mlir][mesh,shard] Fixing misnomers in mesh dialect, renaming 'mesh' dialect to 'shard' (discourse 87053)
A lot of diffs, but simple renames only.
@tkarna @yaochengji