
Conversation


@ronlieb ronlieb commented Nov 11, 2025

No description provided.

RKSimon and others added 30 commits November 11, 2025 17:14
…nt matching and inference and create clusters (llvm#165868)

Adding Matching and Inference Functionality to Propeller. For detailed
information, please refer to the following RFC:
https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238.
This is the fourth PR, which implements matching and inference and
creates the clusters. The associated PRs are:
PR1: llvm#160706
PR2: llvm#162963
PR3: llvm#164223

co-authors: lifengxiang1025
[[email protected]](mailto:[email protected]); zcfh
[[email protected]](mailto:[email protected])

Co-authored-by: lifengxiang1025 <[email protected]>
Co-authored-by: zcfh <[email protected]>
This patch adds a new `FramePointerKind::NonLeafNoReserve` and makes it
the default for `-momit-leaf-frame-pointer`.

It also adds a new command-line option `-m[no-]reserve-frame-pointer-reg`.

This should fix llvm#154379; the main impact of this patch can be found in
`clang/lib/Driver/ToolChains/CommonArgs.cpp`.
llvm#165198)

The ASan test `ThreadedStressStackReuseTest` fails on AIX due to the smaller
default thread stack size there. Set the thread stack size to a minimum of
128KB to ensure reliable test behavior on platforms with smaller default
thread stack sizes.
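
A minimal sketch (not the test's actual code) of pinning a minimum thread
stack size via pthread attributes, assuming a POSIX platform:

```
#include <pthread.h>
#include <algorithm>
#include <cstddef>

static void *Worker(void *) { return nullptr; }

void SpawnWithMinStack() {
  pthread_attr_t Attr;
  pthread_attr_init(&Attr);
  size_t Default = 0;
  pthread_attr_getstacksize(&Attr, &Default);
  // Bump the stack to at least 128KB on platforms with small defaults.
  pthread_attr_setstacksize(&Attr, std::max(Default, size_t{128} * 1024));
  pthread_t Tid;
  pthread_create(&Tid, &Attr, Worker, nullptr);
  pthread_join(Tid, nullptr);
  pthread_attr_destroy(&Attr);
}
```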

---------

Co-authored-by: Riyaz Ahmad <[email protected]>
Simplify `createReadOrMaskedRead` to only require _one_ argument to
specify the vector type to read (passed as `VectorType`) instead of
passing vector-sizes and scalable-flags independently (i.e. _two_
arguments).

A simple overload is provided for users that wouldn't re-use the
corresponding `VectorType` (and hence have no reason to create one).
While there are no upstream users of this overload, it's been helpful
downstream.
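
Purely for illustration (these are not MLIR's actual signatures), a generic
sketch of the same API shape: one argument carries both the sizes and the
scalable flags, with a thin overload for callers that would only construct
that type to make the call:

```
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in "vector type" bundling sizes and scalable flags.
struct VecType {
  std::vector<int64_t> Sizes;
  std::vector<bool> ScalableDims;
};

// Primary entry point: a single argument describes the vector to read.
int createRead(const VecType &Ty) {
  return static_cast<int>(Ty.Sizes.size()); // placeholder body
}

// Convenience overload for callers with no other use for the type object.
int createRead(std::vector<int64_t> Sizes, std::vector<bool> ScalableDims) {
  return createRead(VecType{std::move(Sizes), std::move(ScalableDims)});
}
```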
These tests fail in the profcheck configuration because profinject gets
added to the pipeline and adds metadata that changes the input PGO
information.
…vm#166901)

Now that llvm#166517 has landed and
[Writer](https://github.com/llvm/llvm-project/blob/main/libc/src/stdio/printf_core/writer.h#L130)
has been refactored to track bytes written as size_t, strftime can be
refactored as well to handle size_t return values.

Can't think of a proper way to test this without creating a 2GB+ string,
but existing tests cover most cases.
Closes llvm#161461
- This is my first time contributing to libc's POSIX layer, so for reference
I used the `clock_gettime` implementation for Linux. For convenience, here is
the description of the `clock_settime` function's
[behavior](https://www.man7.org/linux/man-pages/man3/clock_settime.3.html)
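
For reference, a minimal usage sketch of the POSIX interface being
implemented (setting CLOCK_REALTIME normally requires privileges, e.g.
CAP_SYS_TIME on Linux, so expect EPERM when run unprivileged):

```
#include <stdio.h>
#include <time.h>

int main() {
  struct timespec ts;
  if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
    perror("clock_gettime");
    return 1;
  }
  ts.tv_sec += 1; // nudge the wall clock forward by one second
  if (clock_settime(CLOCK_REALTIME, &ts) != 0)
    perror("clock_settime"); // typically EPERM without the right privilege
  return 0;
}
```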
…vm#167405)

The code for v16 of the shared cache objc class layout was copy/pasted
from the previous versions incorrectly. Namely, the wrong class offset
list was used and the class_infos index was never updated.
    
rdar://164430695
…llvm#167379)

According to the
[spec](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc),
it is illegal to addrspacecast to the generic AS, so use the function
pointer AS for null constants.

"It is illegal to use Function Pointer as 'Pointer' argument of
OpPtrCastToGeneric."

This was found when compiling the OpenMP Device RTL for SPIR-V.

Signed-off-by: Nick Sarnie <[email protected]>
This patch adds `asin` to the entry points for Arm and AArch64.

Tests have been run using Arm Toolchain for Embedded, a downstream
toolchain.
…th optimizations possible (llvm#165464)

The patch
[Add strictfp attribute to prevent unwanted optimizations of libm
calls](https://reviews.llvm.org/D34163)
added `I.isStrictFP()` to the condition

```
  if (!I.isNoBuiltin() && !I.isStrictFP() && !F->hasLocalLinkage() &&
        F->hasName() && LibInfo->getLibFunc(*F, Func) &&
        LibInfo->hasOptimizedCodeGen(Func))
```

This prevents the backend from optimizing even non-math libcalls such as
`strlen` and `memcmp` when a call carries the strict floating-point
attribute. For example, it prevents converting `strlen` and `memcmp` into
the millicode calls `__strlen` and `__memcmp`.
… Implement matching and inference and create clusters" (llvm#167559)

Reverts llvm#165868 due to buildbot failures

Co-authored-by: spupyrev <[email protected]>
The existing function is LT, but most of the uses are better
expressed as GE.
Fixed stub relocation test. Just need to check 32-bit.

---------

Co-authored-by: anoopkg6 <[email protected]>
…66883)

parameters when defining the scripting interfaces.

We try to count the parameters to make sure the user has defined them
correctly, but this throws the counting off.

I'm not adding a test for this because then it would seem like we
thought this was a good idea. I'd actually rather not support it at all,
but we added the parameter checking pretty recently, so there
are extant implementations that we broke. I only want to support them,
not suggest anyone else do this going forward.
…lvm#167534)

These selects are dependent on values live into the CHRScope that we
cannot infer anything about, so mark the branch weights unknown. These
selects usually also just get folded down into icmps, so the profile
information ends up being kind of redundant.
…nds (llvm#165295)

Reasoning behind the proposed change: this helps us move away from selecting
v_alignbit for fshr with uniform operands.

V_ALIGNBIT is defined in the ISA as:
D0.u32 = 32'U(({ S0.u32, S1.u32 } >> S2.u32[4 : 0]) & 0xffffffffLL)
Note: S0 carries the MSBs and S1 carries the LSBs of the value being
aligned.

I interpret that as concat(S0, S1) >> (S2 & 0x1F), returning the lower
32 bits.

fshr:

fshr i32 %src0, i32 %src1, i32 %src2
Where:
concat(%src0, %src1) represents the 64-bit value formed by %src0 as the
high 32 bits and %src1 as the low 32 bits.
%src2 is the shift amount.
Only the lower 32 bits are returned.
So these two are identical.

So, I can expand the V_ALIGNBIT through bit manipulation as:
Concat: S1 | (S0 << 32)
Shift: ((S1 | (S0 << 32)) >> S2)
Break the shift: (S1 >> S2) | (S0 << (32 - S2))
The proposed pattern does exactly this.

Additionally, src2 in the fshr pattern:

* must be 0-31;
* if the shift is ≥ 32, hardware semantics differ and it must be handled
with extra instructions.

The extra S_ANDs limit the selection to only the low 5 bits of the shift
amount.
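
A small self-contained check, not taken from the patch, that the expanded
form matches the concat-and-shift definition for shift amounts masked to
5 bits:

```
#include <cassert>
#include <cstdint>

// ISA-style definition: D0 = 32'U(({S0, S1} >> S2[4:0]) & 0xffffffff).
static uint32_t alignbitConcat(uint32_t S0, uint32_t S1, uint32_t S2) {
  uint64_t Concat = (static_cast<uint64_t>(S0) << 32) | S1;
  return static_cast<uint32_t>(Concat >> (S2 & 31));
}

// Expanded form from the derivation: (S1 >> S2) | (S0 << (32 - S2)), with
// the shift masked to 5 bits. Shift == 0 is special-cased here only to
// avoid an out-of-range 32-bit shift in C++.
static uint32_t alignbitExpanded(uint32_t S0, uint32_t S1, uint32_t S2) {
  uint32_t Sh = S2 & 31;
  return Sh == 0 ? S1 : (S1 >> Sh) | (S0 << (32 - Sh));
}

int main() {
  for (uint32_t Sh = 0; Sh < 32; ++Sh)
    assert(alignbitConcat(0xDEADBEEF, 0x01234567, Sh) ==
           alignbitExpanded(0xDEADBEEF, 0x01234567, Sh));
  return 0;
}
```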
The call-graph-section-assembly.ll tests in CodeGen/X86 and
CodeGen/AArch64 both fail under LLVM_REVERSE_ITERATION. These sets should
use SetVector to avoid non-determinism in the output.
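
A minimal illustration of why llvm::SetVector helps here (assumes an LLVM
build environment): it deduplicates like a set but iterates in insertion
order, so the emitted output does not depend on hash-based iteration order:

```
#include "llvm/ADT/SetVector.h"
#include "llvm/Support/raw_ostream.h"

int main() {
  llvm::SetVector<int> Ids;
  Ids.insert(3);
  Ids.insert(1);
  Ids.insert(3); // duplicate, ignored
  Ids.insert(2);
  for (int Id : Ids)
    llvm::outs() << Id << ' '; // always prints "3 1 2", in insertion order
  llvm::outs() << '\n';
  return 0;
}
```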
XChy and others added 20 commits November 12, 2025 04:33
…tTrunc (llvm#167165)

Fixes llvm#165438
With `simd128` enabled, we may encounter vector type truncation in FastISel.
To respect llvm#138479, this patch merely bails out on non-integer IR types,
though I prefer bailing out for all non-simple types as most targets
(X86, AArch64) do.
We want the premerge advisor to write out comments, and we need the
issue-write workflow to trigger on it in order for this to work. Landing
this before the rest of llvm#166609 to enable testing, given this needs
to be in-repo due to permissions issues.
Add helper to make it easier to retrieve the single user of a VPUser.
…167221)

`asm()` on function declarations is used for specifying the mangling.
But that specific spelling is a GNU extension unlike `__asm()`.

Found by building with `-std=c2y` in Clang's C frontend's config file.
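
A minimal sketch with hypothetical names showing the asm-label extension in
question; the bare `asm("...")` spelling is a GNU extension rejected under
strict modes such as `-std=c2y`, while `__asm__("...")` (or `__asm("...")`)
remains available:

```
#include <cstdio>

// The declaration's asm label makes calls to fancy_strlen lower to the
// symbol fancy_strlen_impl (names here are made up for illustration).
extern "C" int fancy_strlen(const char *S) __asm__("fancy_strlen_impl");

// The definition inherits the asm label from the declaration above.
extern "C" int fancy_strlen(const char *S) {
  int N = 0;
  while (S[N])
    ++N;
  return N;
}

int main() {
  std::printf("%d\n", fancy_strlen("hello")); // prints 5
  return 0;
}
```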
Adjust the frame setup code for Windows ARM64 to attempt to align
pair-wise spills to 16-byte boundaries. This enables us to properly emit
the spills for custom Clang calling conventions such as preserve_most,
which spills r9-r15, which are normally nonvolatile registers. Even when
using the ARM64EC opcodes for the unwinding, we cannot represent the
spill if it is unaligned.
…lvm#166475)

Allow widening up to 128-bit registers or if the new register class
is at least as large as one of the existing register classes.

This was artificially limiting. In particular this was doing the wrong
thing with sequences involving copies between VGPRs and AV registers.
Nearly all test changes are improvements.

The coalescer does not just widen registers out of nowhere. If it's trying
to "widen" a register, it's generally packing a register into an existing
register tuple, or in a situation where the constraints imply the wider
class anyway. 067a110 addressed the allocation failure concern by
rejecting coalescing if there are no available registers. The original
change in a4e63ea didn't include a realistic testcase to judge if
this is harmful for pressure. I would expect any issues from this to
be garden-variety subreg handling issues. We could use more dynamic
state information here if it really is an issue.

I get the best results by removing this override completely. This is
a smaller step for patch splitting purposes.
`<stdbool.h>` is provided by the compiler and both Clang and GCC provide
C++-aware versions of these headers, making our own wrapper header
entirely unnecessary.
Classof for most recipes directly supports VPValue, so there is no need
to call getDefiningRecipe when using isa/cast/dyn_cast.
Merge chasing latest versions of bulk test updates
When there are more than 255 sections, the MachO object writer allows
creation of object files which are potentially malformed. Currently,
there are assertions in the object writer code that prevent this behavior,
but for distributions where assertions are turned off this still results
in creation of malformed object files. Turn the assertions into explicit
errors.
…vm#167025)

Ran into a use case where a MachO object file with a section symbol
that did not have a section associated with it caused a segfault during
linking. This patch aims to handle such cases gracefully and keep the
linker from crashing.

---------

Co-authored-by: Ellis Hoag <[email protected]>
Windows doesn't support `pthread_attr`, which was introduced to
asan_test.cpp in llvm#165198, so this change `#ifdef`s out the changes made
in that PR.

Originally reported by Chrome as https://crbug.com/459880605.
These should always use TargetConstant
Adds test coverage with loops where the same loads get executed under
complementary predicates and can be hoisted, together with a set of
negative test cases.
 (llvm#159884)

This eliminates the pseudo register classes used to hack the
wave register class, which are now replaced with RegClassByHwMode,
so most of the diff is from register class ID renumbering.
@ronlieb ronlieb requested review from a team and dpalermo November 11, 2025 23:59
@z1-cciauto z1-cciauto merged commit c1f09d1 into amd-staging Nov 12, 2025
15 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20251111173022 branch November 12, 2025 05:29