@dcandler
Collaborator

@dcandler dcandler commented Dec 8, 2025

This patch adds initial support for the Arm v9.3 C1 processors:

  • C1-Nano
  • C1-Pro
  • C1-Premium
  • C1-Ultra

For more information on each, see:
https://developer.arm.com/Processors/C1-Nano
https://developer.arm.com/Processors/C1-Pro
https://developer.arm.com/Processors/C1-Premium
https://developer.arm.com/Processors/C1-Ultra

Technical Reference Manual for C1-Nano:
https://developer.arm.com/documentation/107753/latest/

Technical Reference Manual for C1-Pro:
https://developer.arm.com/documentation/107771/latest/

Technical Reference Manual for C1-Premium:
https://developer.arm.com/documentation/109416/latest/

Technical Reference Manual for C1-Ultra:
https://developer.arm.com/documentation/108014/latest/

@llvmbot llvmbot added backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Dec 8, 2025
@llvmbot
Member

llvmbot commented Dec 8, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-aarch64

Author: None (dcandler)

Changes

Patch is 49.71 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/171124.diff

13 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+5)
  • (modified) clang/test/Driver/aarch64-mcpu.c (+8)
  • (added) clang/test/Driver/print-enabled-extensions/aarch64-c1-nano.c (+69)
  • (added) clang/test/Driver/print-enabled-extensions/aarch64-c1-premium.c (+71)
  • (added) clang/test/Driver/print-enabled-extensions/aarch64-c1-pro.c (+71)
  • (added) clang/test/Driver/print-enabled-extensions/aarch64-c1-ultra.c (+71)
  • (modified) clang/test/Misc/target-invalid-cpu-note/aarch64.c (+4)
  • (modified) llvm/docs/ReleaseNotes.md (+2)
  • (modified) llvm/lib/Target/AArch64/AArch64Processors.td (+107)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+4)
  • (modified) llvm/lib/TargetParser/Host.cpp (+4)
  • (modified) llvm/unittests/TargetParser/Host.cpp (+12)
  • (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+5-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index acca997e0ff64..e36a4c64965cb 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -631,6 +631,11 @@ X86 Support
 
 Arm and AArch64 Support
 ^^^^^^^^^^^^^^^^^^^^^^^
+- Support has been added for the following processors (command-line identifiers in parentheses):
+  - Arm C1-Nano (``c1-nano``)
+  - Arm C1-Pro (``c1-pro``)
+  - Arm C1-Premium (``c1-premium``)
+  - Arm C1-Ultra (``c1-ultra``)
 - More intrinsics for the following AArch64 instructions:
   FCVTZ[US], FCVTN[US], FCVTM[US], FCVTP[US], FCVTA[US]
 
diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c
index 447ee4bd3a6f9..fdf2e4011487a 100644
--- a/clang/test/Driver/aarch64-mcpu.c
+++ b/clang/test/Driver/aarch64-mcpu.c
@@ -84,6 +84,14 @@
 // CORTEX-A520: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a520"
 // RUN: %clang --target=aarch64 -mcpu=cortex-a520ae -### -c %s 2>&1 | FileCheck -check-prefix=CORTEX-A520AE %s
 // CORTEX-A520AE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-a520ae"
+// RUN: %clang --target=aarch64 -mcpu=c1-nano -### -c %s 2>&1 | FileCheck -check-prefix=C1-NANO %s
+// C1-NANO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "c1-nano"
+// RUN: %clang --target=aarch64 -mcpu=c1-pro -### -c %s 2>&1 | FileCheck -check-prefix=C1-PRO %s
+// C1-PRO: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "c1-pro"
+// RUN: %clang --target=aarch64 -mcpu=c1-premium -### -c %s 2>&1 | FileCheck -check-prefix=C1-PREMIUM %s
+// C1-PREMIUM: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "c1-premium"
+// RUN: %clang --target=aarch64 -mcpu=c1-ultra -### -c %s 2>&1 | FileCheck -check-prefix=C1-ULTRA %s
+// C1-ULTRA: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "c1-ultra"
 
 // RUN: %clang --target=aarch64 -mcpu=cortex-r82  -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s
 // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82"
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-c1-nano.c b/clang/test/Driver/print-enabled-extensions/aarch64-c1-nano.c
new file mode 100644
index 0000000000000..33112527c9add
--- /dev/null
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-c1-nano.c
@@ -0,0 +1,69 @@
+// REQUIRES: aarch64-registered-target
+// RUN: %clang --target=aarch64 --print-enabled-extensions -mcpu=c1-nano | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+
+// CHECK: Extensions enabled for the given AArch64 target
+// CHECK-EMPTY:
+// CHECK-NEXT:     Architecture Feature(s)                                Description
+// CHECK-NEXT:     FEAT_AMUv1                                             Enable Armv8.4-A Activity Monitors extension
+// CHECK-NEXT:     FEAT_AMUv1p1                                           Enable Armv8.6-A Activity Monitors Virtualization support
+// CHECK-NEXT:     FEAT_AdvSIMD                                           Enable Advanced SIMD instructions
+// CHECK-NEXT:     FEAT_BF16                                              Enable BFloat16 Extension
+// CHECK-NEXT:     FEAT_BTI                                               Enable Branch Target Identification
+// CHECK-NEXT:     FEAT_CCIDX                                             Enable Armv8.3-A Extend of the CCSIDR number of sets
+// CHECK-NEXT:     FEAT_CHK                                               Enable Armv8.0-A Check Feature Status Extension
+// CHECK-NEXT:     FEAT_CLRBHB                                            Enable Clear BHB instruction
+// CHECK-NEXT:     FEAT_CRC32                                             Enable Armv8.0-A CRC-32 checksum instructions
+// CHECK-NEXT:     FEAT_CSV2_2                                            Enable architectural speculation restriction
+// CHECK-NEXT:     FEAT_DIT                                               Enable Armv8.4-A Data Independent Timing instructions
+// CHECK-NEXT:     FEAT_DPB                                               Enable Armv8.2-A data Cache Clean to Point of Persistence
+// CHECK-NEXT:     FEAT_DPB2                                              Enable Armv8.5-A Cache Clean to Point of Deep Persistence
+// CHECK-NEXT:     FEAT_DotProd                                           Enable dot product support
+// CHECK-NEXT:     FEAT_ECV                                               Enable enhanced counter virtualization extension
+// CHECK-NEXT:     FEAT_ETE                                               Enable Embedded Trace Extension
+// CHECK-NEXT:     FEAT_FCMA                                              Enable Armv8.3-A Floating-point complex number support
+// CHECK-NEXT:     FEAT_FGT                                               Enable fine grained virtualization traps extension
+// CHECK-NEXT:     FEAT_FHM                                               Enable FP16 FML instructions
+// CHECK-NEXT:     FEAT_FP                                                Enable Armv8.0-A Floating Point Extensions
+// CHECK-NEXT:     FEAT_FP16                                              Enable half-precision floating-point data processing
+// CHECK-NEXT:     FEAT_FPAC                                              Enable Armv8.3-A Pointer Authentication Faulting enhancement
+// CHECK-NEXT:     FEAT_FRINTTS                                           Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
+// CHECK-NEXT:     FEAT_FlagM                                             Enable Armv8.4-A Flag Manipulation instructions
+// CHECK-NEXT:     FEAT_FlagM2                                            Enable alternative NZCV format for floating point comparisons
+// CHECK-NEXT:     FEAT_HBC                                               Enable Armv8.8-A Hinted Conditional Branches Extension
+// CHECK-NEXT:     FEAT_HCX                                               Enable Armv8.7-A HCRX_EL2 system register
+// CHECK-NEXT:     FEAT_I8MM                                              Enable Matrix Multiply Int8 Extension
+// CHECK-NEXT:     FEAT_JSCVT                                             Enable Armv8.3-A JavaScript FP conversion instructions
+// CHECK-NEXT:     FEAT_LOR                                               Enable Armv8.1-A Limited Ordering Regions extension
+// CHECK-NEXT:     FEAT_LRCPC                                             Enable support for RCPC extension
+// CHECK-NEXT:     FEAT_LRCPC2                                            Enable Armv8.4-A RCPC instructions with Immediate Offsets
+// CHECK-NEXT:     FEAT_LRCPC3                                            Enable Armv8.9-A RCPC instructions for A64 and Advanced SIMD and floating-point instruction set
+// CHECK-NEXT:     FEAT_LSE                                               Enable Armv8.1-A Large System Extension (LSE) atomic instructions
+// CHECK-NEXT:     FEAT_LSE2                                              Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules
+// CHECK-NEXT:     FEAT_MOPS                                              Enable Armv8.8-A memcpy and memset acceleration instructions
+// CHECK-NEXT:     FEAT_MPAM                                              Enable Armv8.4-A Memory system Partitioning and Monitoring extension
+// CHECK-NEXT:     FEAT_MTE, FEAT_MTE2                                    Enable Memory Tagging Extension
+// CHECK-NEXT:     FEAT_NMI, FEAT_GICv3_NMI                               Enable Armv8.8-A Non-maskable Interrupts
+// CHECK-NEXT:     FEAT_NV, FEAT_NV2                                      Enable Armv8.4-A Nested Virtualization Enchancement
+// CHECK-NEXT:     FEAT_PAN                                               Enable Armv8.1-A Privileged Access-Never extension
+// CHECK-NEXT:     FEAT_PAN2                                              Enable Armv8.2-A PAN s1e1R and s1e1W Variants
+// CHECK-NEXT:     FEAT_PAuth                                             Enable Armv8.3-A Pointer Authentication extension
+// CHECK-NEXT:     FEAT_PMUv3                                             Enable Armv8.0-A PMUv3 Performance Monitors extension
+// CHECK-NEXT:     FEAT_RAS, FEAT_RASv1p1                                 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
+// CHECK-NEXT:     FEAT_RDM                                               Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
+// CHECK-NEXT:     FEAT_SB                                                Enable Armv8.5-A Speculation Barrier
+// CHECK-NEXT:     FEAT_SEL2                                              Enable Armv8.4-A Secure Exception Level 2 extension
+// CHECK-NEXT:     FEAT_SME                                               Enable Scalable Matrix Extension (SME)
+// CHECK-NEXT:     FEAT_SME2                                              Enable Scalable Matrix Extension 2 (SME2) instructions
+// CHECK-NEXT:     FEAT_SPECRES                                           Enable Armv8.5-A execution and data prediction invalidation instructions
+// CHECK-NEXT:     FEAT_SPECRES2                                          Enable Speculation Restriction Instruction
+// CHECK-NEXT:     FEAT_SSBS, FEAT_SSBS2                                  Enable Speculative Store Bypass Safe bit
+// CHECK-NEXT:     FEAT_SVE                                               Enable Scalable Vector Extension (SVE) instructions
+// CHECK-NEXT:     FEAT_SVE2                                              Enable Scalable Vector Extension 2 (SVE2) instructions
+// CHECK-NEXT:     FEAT_SVE_BitPerm                                       Enable bit permutation SVE2 instructions
+// CHECK-NEXT:     FEAT_TLBIOS, FEAT_TLBIRANGE                            Enable Armv8.4-A TLB Range and Maintenance instructions
+// CHECK-NEXT:     FEAT_TRBE                                              Enable Trace Buffer Extension
+// CHECK-NEXT:     FEAT_TRF                                               Enable Armv8.4-A Trace extension
+// CHECK-NEXT:     FEAT_UAO                                               Enable Armv8.2-A UAO PState
+// CHECK-NEXT:     FEAT_VHE                                               Enable Armv8.1-A Virtual Host extension
+// CHECK-NEXT:     FEAT_WFxT                                              Enable Armv8.7-A WFET and WFIT instruction
+// CHECK-NEXT:     FEAT_XS                                                Enable Armv8.7-A limited-TLB-maintenance instruction
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-c1-premium.c b/clang/test/Driver/print-enabled-extensions/aarch64-c1-premium.c
new file mode 100644
index 0000000000000..3b146e36f81a2
--- /dev/null
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-c1-premium.c
@@ -0,0 +1,71 @@
+// REQUIRES: aarch64-registered-target
+// RUN: %clang --target=aarch64 --print-enabled-extensions -mcpu=c1-premium | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+
+// CHECK: Extensions enabled for the given AArch64 target
+// CHECK-EMPTY:
+// CHECK-NEXT:     Architecture Feature(s)                                Description
+// CHECK-NEXT:     FEAT_AMUv1                                             Enable Armv8.4-A Activity Monitors extension
+// CHECK-NEXT:     FEAT_AMUv1p1                                           Enable Armv8.6-A Activity Monitors Virtualization support
+// CHECK-NEXT:     FEAT_AdvSIMD                                           Enable Advanced SIMD instructions
+// CHECK-NEXT:     FEAT_BF16                                              Enable BFloat16 Extension
+// CHECK-NEXT:     FEAT_BTI                                               Enable Branch Target Identification
+// CHECK-NEXT:     FEAT_CCIDX                                             Enable Armv8.3-A Extend of the CCSIDR number of sets
+// CHECK-NEXT:     FEAT_CHK                                               Enable Armv8.0-A Check Feature Status Extension
+// CHECK-NEXT:     FEAT_CLRBHB                                            Enable Clear BHB instruction
+// CHECK-NEXT:     FEAT_CRC32                                             Enable Armv8.0-A CRC-32 checksum instructions
+// CHECK-NEXT:     FEAT_CSV2_2                                            Enable architectural speculation restriction
+// CHECK-NEXT:     FEAT_DIT                                               Enable Armv8.4-A Data Independent Timing instructions
+// CHECK-NEXT:     FEAT_DPB                                               Enable Armv8.2-A data Cache Clean to Point of Persistence
+// CHECK-NEXT:     FEAT_DPB2                                              Enable Armv8.5-A Cache Clean to Point of Deep Persistence
+// CHECK-NEXT:     FEAT_DotProd                                           Enable dot product support
+// CHECK-NEXT:     FEAT_ECV                                               Enable enhanced counter virtualization extension
+// CHECK-NEXT:     FEAT_ETE                                               Enable Embedded Trace Extension
+// CHECK-NEXT:     FEAT_FCMA                                              Enable Armv8.3-A Floating-point complex number support
+// CHECK-NEXT:     FEAT_FGT                                               Enable fine grained virtualization traps extension
+// CHECK-NEXT:     FEAT_FHM                                               Enable FP16 FML instructions
+// CHECK-NEXT:     FEAT_FP                                                Enable Armv8.0-A Floating Point Extensions
+// CHECK-NEXT:     FEAT_FP16                                              Enable half-precision floating-point data processing
+// CHECK-NEXT:     FEAT_FPAC                                              Enable Armv8.3-A Pointer Authentication Faulting enhancement
+// CHECK-NEXT:     FEAT_FRINTTS                                           Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
+// CHECK-NEXT:     FEAT_FlagM                                             Enable Armv8.4-A Flag Manipulation instructions
+// CHECK-NEXT:     FEAT_FlagM2                                            Enable alternative NZCV format for floating point comparisons
+// CHECK-NEXT:     FEAT_HBC                                               Enable Armv8.8-A Hinted Conditional Branches Extension
+// CHECK-NEXT:     FEAT_HCX                                               Enable Armv8.7-A HCRX_EL2 system register
+// CHECK-NEXT:     FEAT_I8MM                                              Enable Matrix Multiply Int8 Extension
+// CHECK-NEXT:     FEAT_JSCVT                                             Enable Armv8.3-A JavaScript FP conversion instructions
+// CHECK-NEXT:     FEAT_LOR                                               Enable Armv8.1-A Limited Ordering Regions extension
+// CHECK-NEXT:     FEAT_LRCPC                                             Enable support for RCPC extension
+// CHECK-NEXT:     FEAT_LRCPC2                                            Enable Armv8.4-A RCPC instructions with Immediate Offsets
+// CHECK-NEXT:     FEAT_LRCPC3                                            Enable Armv8.9-A RCPC instructions for A64 and Advanced SIMD and floating-point instruction set
+// CHECK-NEXT:     FEAT_LSE                                               Enable Armv8.1-A Large System Extension (LSE) atomic instructions
+// CHECK-NEXT:     FEAT_LSE2                                              Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules
+// CHECK-NEXT:     FEAT_MOPS                                              Enable Armv8.8-A memcpy and memset acceleration instructions
+// CHECK-NEXT:     FEAT_MPAM                                              Enable Armv8.4-A Memory system Partitioning and Monitoring extension
+// CHECK-NEXT:     FEAT_MTE, FEAT_MTE2                                    Enable Memory Tagging Extension
+// CHECK-NEXT:     FEAT_NMI, FEAT_GICv3_NMI                               Enable Armv8.8-A Non-maskable Interrupts
+// CHECK-NEXT:     FEAT_NV, FEAT_NV2                                      Enable Armv8.4-A Nested Virtualization Enchancement
+// CHECK-NEXT:     FEAT_PAN                                               Enable Armv8.1-A Privileged Access-Never extension
+// CHECK-NEXT:     FEAT_PAN2                                              Enable Armv8.2-A PAN s1e1R and s1e1W Variants
+// CHECK-NEXT:     FEAT_PAuth                                             Enable Armv8.3-A Pointer Authentication extension
+// CHECK-NEXT:     FEAT_PMUv3                                             Enable Armv8.0-A PMUv3 Performance Monitors extension
+// CHECK-NEXT:     FEAT_RAS, FEAT_RASv1p1                                 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
+// CHECK-NEXT:     FEAT_RDM                                               Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
+// CHECK-NEXT:     FEAT_SB                                                Enable Armv8.5-A Speculation Barrier
+// CHECK-NEXT:     FEAT_SEL2                                              Enable Armv8.4-A Secure Exception Level 2 extension
+// CHECK-NEXT:     FEAT_SME                                               Enable Scalable Matrix Extension (SME)
+// CHECK-NEXT:     FEAT_SME2                                              Enable Scalable Matrix Extension 2 (SME2) instructions
+// CHECK-NEXT:     FEAT_SPE                                               Enable Statistical Profiling extension
+// CHECK-NEXT:     FEAT_SPECRES                                           Enable Armv8.5-A execution and data prediction invalidation instructions
+// CHECK-NEXT:     FEAT_SPECRES2                                          Enable Speculation Restriction Instruction
+// CHECK-NEXT:     FEAT_SPEv1p2                                           Enable extra register in the Statistical Profiling Extension
+// CHECK-NEXT:     FEAT_SSBS, FEAT_SSBS2                                  Enable Speculative Store Bypass Safe bit
+// CHECK-NEXT:     FEAT_SVE                                               Enable Scalable Vector Extension (SVE) instructions
+// CHECK-NEXT:     FEAT_SVE2                                              Enable Scalable Vector Extension 2 (SVE2) instructions
+// CHECK-NEXT:     FEAT_SVE_BitPerm                                       Enable bit permutation SVE2 instructions
+// CHECK-NEXT:     FEAT_TLBIOS, FEAT_TLBIRANGE                            Enable Armv8.4-A TLB Range and Maintenance instructions
+// CHECK-NEXT:     FEAT_TRBE                                              Enable Trace Buffer Extension
+// CHECK-NEXT:     FEAT_TRF                                               Enable Armv8.4-A Trace extension
+// CHECK-NEXT:     FEAT_UAO                                               Enable Armv8.2-A UAO PState
+// CHECK-NEXT:     FEAT_VHE                                               Enable Armv8.1-A Virtual Host extension
+// CHECK-NEXT:     FEAT_WFxT                                              Enable Armv8.7-A WFET and WFIT instruction
+// CHECK-NEXT:     FEAT_XS                                                Enable Armv8.7-A limited-TLB-maintenance instruction
diff --git a/clang/test/Driver/print-enabled-extensions/aarch64-c1-pro.c b/clang/test/Driver/print-enabled-extensions/aarch64-c1-pro.c
new file mode 100644
index 0000000000000..d31a598463267
--- /dev/null
+++ b/clang/test/Driver/print-enabled-extensions/aarch64-c1-pro.c
@@ -0,0 +1,71 @@
+// REQUIRES: aarch64-registered-target
+// RUN: %clang --target=aarch64 --print-enabled-extensions -mcpu=c1-pro | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+
+// CHECK: Extensions enabled for the given AArch64 target
+// CHECK-EMPTY:
+// CHECK-NEXT:     Architecture Feature(s)                                Description
+// CHECK-NEXT:     FEAT_AMUv1                                             Enable Armv8.4-A Activity Monitors extension
+// CHECK-NEXT:     FEAT_AMUv1p1                                           Enable Armv8.6-A Activity Monitors Virtualization support
+// CHECK-NEXT:     FEAT_AdvSIMD                                           Enable Advanced SIMD instructions
+// CHECK-NEXT:     FEAT_BF16                                              Enable BFloat16 Extension
+// CHECK-NEXT:     FEAT_BTI                ...
[truncated]

@Andarwinux
Contributor

It looks like all C1 series enabled SME and MOPS unconditionally. But according to Arm, only C1-Ultra/Premium are forced with SME support, and MOPS may cause performance degradation.

https://developer.arm.com/documentation/111076/0100
https://developer.arm.com/documentation/107753/0001/The-C1-Nano--core/C1-Nano--core-features

The C1-SME2 unit is optional, unless the cluster includes an ultimate-performance core. If the C1-SME2 unit is not implemented, SME and SME2 are not supported. For more information about configuring the C1-SME2 unit, see the Arm® C1-Scalable Matrix Extension 2 Configuration and Integration Manual and the RTL configuration process section in the Arm® C1-DynamIQ™ Shared Unit Configuration and Integration Manual.

https://developer.arm.com/documentation/111077/8-0

Under certain micro-architectural conditions, when the Processing Element (PE) is executing FEAT_MOPS instructions, performance might be degraded. This is due to micro-architectural flushes that occur due to read-after-write hazards or hardware prefetch ineffectively caching contiguous accesses.

@dcandler
Collaborator Author

dcandler commented Dec 9, 2025

> It looks like all C1 series enabled SME and MOPS unconditionally. But according to Arm, only C1-Ultra/Premium are forced with SME support, and MOPS may cause performance degradation.

My understanding is that the convention of the -mcpu option is to enable all optional and mandatory features. The C1-Nano and Pro can be configured with SME, so the compiler should be able to support it, while MOPS is mandatory from v8.8.

Contributor

@jthackray jthackray left a comment

Thanks for the fixes. LGTM now.

@Andarwinux
Contributor

> > It looks like all C1 series enabled SME and MOPS unconditionally. But according to Arm, only C1-Ultra/Premium are forced with SME support, and MOPS may cause performance degradation.
>
> My understanding is that the convention of the -mcpu option is to enable all optional and mandatory features. The C1-Nano and Pro can be configured with SME, so the compiler should be able to support it, while MOPS is mandatory from v8.8.

As far as I know, -mcpu=c1-nano doesn't enable the optional crypto extension by default unless +crypto is appended. SME is also optional for C1-Nano but is always enabled, which feels very inconsistent. And if LLVM implements SME auto-vectorization in the future, binaries compiled with -mcpu=c1-nano won't be able to execute on a C1-Nano configured without SME.

As for MOPS, I'm just concerned that -mcpu=c1-* will result in memcpy and memset to be inlined unconditionally, which according to Arm's errata will result in performance degradation, and thus be inconsistent with the user's intended purpose of maximum performance.
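
One way to see what a given -mcpu setting implies for a translation unit is to inspect the ACLE feature macros at compile time. The sketch below is illustrative only and is not part of this patch. It assumes the compiler defines `__ARM_FEATURE_SME`, `__ARM_FEATURE_MOPS` and `__ARM_FEATURE_AES` when the corresponding extensions are enabled; the `BUILT_WITH_*` names and `describe_build` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Each BUILT_WITH_* constant is 1 when the file was compiled with the
   matching ACLE feature macro defined, and 0 otherwise. */
#ifdef __ARM_FEATURE_SME
#define BUILT_WITH_SME 1
#else
#define BUILT_WITH_SME 0
#endif

#ifdef __ARM_FEATURE_MOPS
#define BUILT_WITH_MOPS 1
#else
#define BUILT_WITH_MOPS 0
#endif

#ifdef __ARM_FEATURE_AES
#define BUILT_WITH_AES 1
#else
#define BUILT_WITH_AES 0
#endif

/* Writes a one-line summary of the form "SME=x MOPS=y AES=z" into buf and
   returns the number of characters snprintf would have written. */
static int describe_build(char *buf, size_t len) {
  assert(buf != NULL && len > 0);
  return snprintf(buf, len, "SME=%d MOPS=%d AES=%d",
                  BUILT_WITH_SME, BUILT_WITH_MOPS, BUILT_WITH_AES);
}
```

Under the assumption above, a build with -mcpu=c1-nano on a compiler carrying this patch would be expected to report SME and MOPS as enabled and AES as disabled unless +crypto is appended. On a non-AArch64 host build all three read 0.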

@statham-arm
Collaborator

> As for MOPS, I'm just concerned that -mcpu=c1-* will result in memcpy and memset to be inlined unconditionally, which according to Arm's errata will result in performance degradation,

On the other hand, if someone explicitly writes a MOPS instruction in assembly language, it shouldn't fail to assemble. The questions of "does the CPU understand this instruction?" and "is it a good idea to use it in code generation?" are conceptually separate.

@smithp35
Collaborator

smithp35 commented Dec 9, 2025

We do have a convention for CPUs, agreed with the Arm GNU team as we want CPU features to be consistent as possible across toolchains, that for each Arm CPU we enable all optional features (like SME), with crypto being the exception of always being opt-in. This is partly historical as GCC has always modelled it that way and crypto extensions are subject to export control.

I understand that a lot of this will look inconsistent externally.

As statham-arm mentions we want to separate what features the CPU supports from whether it is the right thing to do to make use of them.
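
The convention described in this comment can be sketched as a toy model: every optional feature is on by default for a CPU name, and crypto is only added on request. This is a hypothetical illustration, not how Clang or LLVM represent features. The `FEAT_*` bits and `features_for_mcpu` are invented for the example, the string matching is deliberately naive, and `+nosme` is assumed to follow the usual `+no<ext>` modifier form.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature bits used only by this sketch. */
enum {
  FEAT_MANDATORY = 1u << 0, /* stands in for the architectural baseline */
  FEAT_SME       = 1u << 1, /* optional on C1-Nano and C1-Pro per the thread */
  FEAT_CRYPTO    = 1u << 2  /* optional and, by convention, opt-in */
};

/* Returns the features implied by an -mcpu string such as "c1-nano" or
   "c1-nano+crypto". Naive substring matching; a real parser would split
   on '+' and validate each modifier. */
static unsigned features_for_mcpu(const char *mcpu) {
  assert(mcpu != NULL);
  /* Optional features such as SME are enabled by default. */
  unsigned feats = FEAT_MANDATORY | FEAT_SME;
  /* Crypto is the exception: it must be requested explicitly. */
  if (strstr(mcpu, "+crypto") != NULL)
    feats |= FEAT_CRYPTO;
  /* A user targeting a configuration without SME can opt out. */
  if (strstr(mcpu, "+nosme") != NULL)
    feats &= ~(unsigned)FEAT_SME;
  return feats;
}
```

In this model `features_for_mcpu("c1-nano")` includes `FEAT_SME` but not `FEAT_CRYPTO`, which mirrors the asymmetry raised earlier in the thread.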

@Andarwinux
Contributor

> > As for MOPS, I'm just concerned that -mcpu=c1-* will result in memcpy and memset to be inlined unconditionally, which according to Arm's errata will result in performance degradation,
>
> On the other hand, if someone explicitly writes a MOPS instruction in assembly language, it shouldn't fail to assemble. The questions of "does the CPU understand this instruction?" and "is it a good idea to use it in code generation?" are conceptually separate.

But at the moment LLVM doesn't seem to be able to distinguish between the two. I think a tune option could be added to avoid preferring MOPS in code generation, similar to x86's "FeatureFSRM".
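
The separation being proposed can be sketched as two independent flags: one answering whether the instructions exist, the other whether code generation should pick them automatically. This is a hypothetical model, not LLVM's subtarget code. The `cpu_model` struct, the `tune_avoid_mops` flag and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU description used only by this sketch. */
struct cpu_model {
  bool has_mops;        /* architectural: the MOPS instructions exist */
  bool tune_avoid_mops; /* tuning: do not select them automatically */
};

/* The assembler question: is a hand-written MOPS instruction legal? */
static bool can_assemble_mops(const struct cpu_model *cpu) {
  assert(cpu != NULL);
  return cpu->has_mops;
}

/* The code-generation question: should memcpy be expanded with MOPS? */
static bool should_expand_memcpy_with_mops(const struct cpu_model *cpu) {
  assert(cpu != NULL);
  return cpu->has_mops && !cpu->tune_avoid_mops;
}
```

With `has_mops` set and `tune_avoid_mops` set, hand-written MOPS assembly is still accepted while automatic expansion of memcpy is declined, which is the behaviour the tune option is meant to allow.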

> We do have a convention for CPUs, agreed with the Arm GNU team as we want CPU features to be consistent as possible across toolchains, that for each Arm CPU we enable all optional features (like SME), with crypto being the exception of always being opt-in. This is partly historical as GCC has always modelled it that way and crypto extensions are subject to export control.
>
> I understand that a lot of this will look inconsistent externally.
>
> As statham-arm mentions we want to separate what features the CPU supports from whether it is the right thing to do to make use of them.

Well, that makes sense. And it seems unlikely that there will be actual products that indeed don't have C1-SME2.

But I'm still concerned that this will cause people to avoid using -mcpu=c1-*, just as happened with Qualcomm's SoCs, whose Armv9 Cortex cores have SVE disabled, so binaries built with -mcpu=cortex-* can't be executed on those Cortex cores.

Collaborator

@davemgreen davemgreen left a comment

I went through the features and they look OK to me, using a comparison to the previous generation. RPRFM should be added to these CPUs too, but that is only being added in #170490.

Labels

backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl'

7 participants