[AArch64] Add FeatureZCRegMoveFPR128 subtarget feature #148427
Created using spr 1.3.6
@llvm/pr-subscribers-backend-aarch64
Author: Tomer Shafir (tomershafir)
Changes
Adds a subtarget feature called `FeatureZCRegMoveFPR128`.
Full diff: https://github.com/llvm/llvm-project/pull/148427.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index 9973df865ea17..d6f9d7a8b8941 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -627,6 +627,9 @@ def FeatureZCRegMoveFPR64 : SubtargetFeature<"zcm-fpr64", "HasZeroCycleRegMoveFP
def FeatureZCRegMoveFPR32 : SubtargetFeature<"zcm-fpr32", "HasZeroCycleRegMoveFPR32", "true",
"Has zero-cycle register moves for FPR32 registers">;
+def FeatureZCRegMoveFPR128 : SubtargetFeature<"zcm-fpr128", "HasZeroCycleRegMoveFPR128", "true",
+ "Has zero-cycle register moves for FPR128 registers">;
+
def FeatureZCZeroingGP : SubtargetFeature<"zcz-gp", "HasZeroCycleZeroingGP", "true",
"Has zero-cycle zeroing instructions for generic registers">;
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index 5379305bc7a7f..172b840d17ee2 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -313,6 +313,7 @@ def TuneAppleA7 : SubtargetFeature<"apple-a7", "ARMProcFamily", "AppleA7",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing,
FeatureZCZeroingFPWorkaround]>;
@@ -327,6 +328,7 @@ def TuneAppleA10 : SubtargetFeature<"apple-a10", "ARMProcFamily", "AppleA10",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
@@ -340,6 +342,7 @@ def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
@@ -353,6 +356,7 @@ def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
@@ -366,6 +370,7 @@ def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
@@ -384,6 +389,7 @@ def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
@@ -402,6 +408,7 @@ def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
@@ -420,6 +427,7 @@ def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
@@ -438,6 +446,7 @@ def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
FeatureStorePairSuppress,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing]>;
def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
@@ -455,6 +464,7 @@ def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
FeatureFuseLiterals,
FeatureZCRegMoveGPR64,
FeatureZCRegMoveFPR64,
+ FeatureZCRegMoveFPR128,
FeatureZCZeroing
]>;
This is part of a patch series:
Retreating to a single-commit patch for all of the changes, as the stacked PRs are hard to operate.
This patch aims to improve utilization of zero-cycle instructions based on subtarget support:

- Adds a subtarget feature called `FeatureZCRegMoveFPR128` that makes it possible to query whether the target supports zero-cycle register moves for FPR128 NEON registers, and adds it to the appropriate processors.
- Adds two subtarget hooks, `canLowerToZeroCycleRegMove` and `canLowerToZeroCycleRegZeroing`, to query whether an instruction can be lowered to a zero-cycle instruction. The logic depends on the microarchitecture. This patch also provides an implementation for AArch64 based on `AArch64InstrInfo::copyPhysReg`, which supports both physical and virtual registers.
- Adds a target hook, `shouldReMaterializeTrivialRegDef`, that lets a target specify whether rematerialization of the copy is beneficial. This patch also provides an implementation for AArch64 based on the new subtarget hooks `canLowerToZeroCycleReg[Move|Zeroing]`.
- Makes the register coalescer prevent rematerialization of a trivial def for a move instruction if the target advises against it, based on the new target hook `shouldReMaterializeTrivialRegDef`. The filter is appended to the existing logic.

The patch includes isolated MIR tests for all supported register classes and fixes existing tests.

Previous stacked PRs:

- llvm#148427
- llvm#148428
- llvm#148429
- llvm#148430
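To make the intended interaction between the hooks easier to follow, here is a minimal, self-contained C++ sketch of the decision described above. It is a toy model, not the code from this patch: `ToySubtarget`, `RegClass`, and `coalescerWouldRemat` are hypothetical names, the mapping from register class to feature flag is a simplifying assumption, and the zeroing hook is left out. Only the hook names `canLowerToZeroCycleRegMove` and `shouldReMaterializeTrivialRegDef` come from the description.

```cpp
// Toy stand-ins for illustration only; none of these types are LLVM classes.
enum class RegClass { GPR32, GPR64, FPR32, FPR64, FPR128 };

struct ToySubtarget {
  // Flags loosely mirroring the zcm-* subtarget features (assumed mapping).
  bool HasZCRegMoveGPR64 = false;
  bool HasZCRegMoveFPR64 = false;
  bool HasZCRegMoveFPR128 = false;

  // Hypothetical analogue of canLowerToZeroCycleRegMove: would a copy in
  // this register class be lowered to a zero-cycle move on this subtarget?
  bool canLowerToZeroCycleRegMove(RegClass RC) const {
    switch (RC) {
    case RegClass::GPR64:
      return HasZCRegMoveGPR64;
    case RegClass::FPR64:
      return HasZCRegMoveFPR64;
    case RegClass::FPR128:
      return HasZCRegMoveFPR128;
    default:
      return false;
    }
  }
};

// Hypothetical analogue of shouldReMaterializeTrivialRegDef: rematerializing
// the definition is treated as beneficial only when the copy it would
// replace is not already free.
bool shouldReMaterializeTrivialRegDef(const ToySubtarget &ST, RegClass CopyRC) {
  return !ST.canLowerToZeroCycleRegMove(CopyRC);
}

// The target's answer is appended to whatever checks already exist, so it
// can only veto a rematerialization, never force one.
bool coalescerWouldRemat(bool ExistingChecksPass, const ToySubtarget &ST,
                         RegClass CopyRC) {
  return ExistingChecksPass && shouldReMaterializeTrivialRegDef(ST, CopyRC);
}
```

In this model, a subtarget with `HasZCRegMoveFPR128` set keeps FPR128 copies as moves, while an FPR32 copy on the same subtarget is still eligible for rematerialization.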
Adds a subtarget feature called `FeatureZCRegMoveFPR128` that makes it possible to query whether the target supports zero-cycle register moves (ZCM) for FPR128 NEON registers, and adds it to the appropriate processors. It prepares for a register coalescer optimization that prevents rematerialization of moves where the target supports ZCM.
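As a rough illustration of what "querying" such a feature amounts to, the sketch below models a feature string toggling a boolean flag that later code can test. This is a toy under stated assumptions, not LLVM's generated subtarget code: `ToyFeatureBits` and `applyFeatures` are hypothetical, and only the feature names `zcm-fpr128` and `zcm-fpr64` and the field names are taken from the diff. The `+`/`-` prefixes follow the usual LLVM feature-string convention.

```cpp
#include <string>
#include <vector>

// Hypothetical flag holder; field names echo the ones in the TableGen diff.
struct ToyFeatureBits {
  bool HasZeroCycleRegMoveFPR64 = false;
  bool HasZeroCycleRegMoveFPR128 = false;
};

// Applies "+name" / "-name" entries in order, so later entries win.
// Unknown entries are ignored in this toy model.
ToyFeatureBits applyFeatures(const std::vector<std::string> &Features) {
  ToyFeatureBits Bits;
  for (const std::string &F : Features) {
    if (F.size() < 2)
      continue;
    const bool Enable = F[0] == '+';
    if (!Enable && F[0] != '-')
      continue;
    const std::string Name = F.substr(1);
    if (Name == "zcm-fpr128")
      Bits.HasZeroCycleRegMoveFPR128 = Enable;
    else if (Name == "zcm-fpr64")
      Bits.HasZeroCycleRegMoveFPR64 = Enable;
  }
  return Bits;
}
```

A processor definition that lists the feature behaves, in this model, like passing `"+zcm-fpr128"` by default; a later `"-zcm-fpr128"` entry would turn it off again.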