forked from llvm/llvm-project
merge main into amd-staging #567
Merged
z1-cciauto
merged 65 commits into
amd-staging
from
amd/merge/upstream_merge_20251111173022
Nov 12, 2025
…nt matching and inference and create clusters (llvm#165868) Adding Matching and Inference Functionality to Propeller. For detailed information, please refer to the following RFC: https://discourse.llvm.org/t/rfc-adding-matching-and-inference-functionality-to-propeller/86238. This is the fourth PR, which is used to implement matching and inference and create the clusters. The associated PRs are: PR1: llvm#160706 PR2: llvm#162963 PR3: llvm#164223 co-authors: lifengxiang1025 [[email protected]](mailto:[email protected]); zcfh [[email protected]](mailto:[email protected]) Co-authored-by: lifengxiang1025 <[email protected]> Co-authored-by: zcfh <[email protected]>
This patch adds a new `FramePointerKind::NonLeafNoReserve` and makes it the default for `-momit-leaf-frame-pointer`. It also adds a new command-line option `-m[no-]reserve-frame-pointer-reg`. This should fix llvm#154379. The main impact of this patch can be found in `clang/lib/Driver/ToolChains/CommonArgs.cpp`.
…ing pass: `arith-to-apfloat` (llvm#166618)" (llvm#167431)"" (llvm#167549) Reverts llvm#167436 to fix sanitizers
llvm#165198) Asan test `ThreadedStressStackReuseTest ` fails on AIX due to smaller default thread stack size. Set thread stack size to a minimum of 128KB to ensure reliable test behavior across platforms (platforms with smaller default thread stack size). --------- Co-authored-by: Riyaz Ahmad <[email protected]>
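As a rough illustration of the approach described above (not the code from the patch itself), raising a thread attribute's stack size to a floor of 128KB on a POSIX platform might look like the following. The helper name `ensureMinStackSize` is hypothetical; only the `pthread_attr_*` calls are real POSIX APIs.

```cpp
#include <pthread.h>

#include <cstddef>

// Hypothetical helper: raise the stack size stored in `attr` to at least
// `min_bytes`, leaving larger platform defaults untouched.
// Returns 0 on success or the error code reported by the pthread call.
int ensureMinStackSize(pthread_attr_t* attr, std::size_t min_bytes) {
  std::size_t current = 0;
  if (int rc = pthread_attr_getstacksize(attr, &current)) return rc;
  if (current >= min_bytes) return 0;  // default is already large enough
  return pthread_attr_setstacksize(attr, min_bytes);
}
```

A caller would initialize the attribute with `pthread_attr_init`, call `ensureMinStackSize(&attr, 128 * 1024)`, and then pass the attribute to `pthread_create`.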
Simplify `createReadOrMaskedRead` to only require _one_ argument to specify the vector type to read (passed as `VectorType`) instead of passing vector-sizes and scalable-flags independently (i.e. _two_ arguments). A simple overload is provided for users that wouldn't re-use the corresponding `VectorType` (and hence there's no point for them to create). While there are no users upstream for this overload, it's been helpful downstream.
These tests fail in the profcheck configuration because profinject gets added to the pipeline and adds metadata that changes the input PGO information.
…vm#166901) Now that llvm#166517 has landed and [Writer](https://github.com/llvm/llvm-project/blob/main/libc/src/stdio/printf_core/writer.h#L130) has been refactored to track bytes written as size_t, strftime can be refactored as well to handle size_t return values. Can't think of a proper way to test this without creating a 2GB+ string, but existing tests cover most cases.
Implements `fchown`. Fixes llvm#166856.
Closes llvm#161461 - This is my first time contributing to libc's POSIX functions, so for reference I used the `clock_gettime` implementation for Linux. For convenience, here is the description of the `clock_settime` function's [behavior](https://www.man7.org/linux/man-pages/man3/clock_settime.3.html).
…vm#167405) The code for v16 of the shared cache objc class layout was copy/pasted from the previous versions incorrectly. Namely, the wrong class offset list was used and the class_infos index was never updated. rdar://164430695
…llvm#167379) According to the [spec](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc), it is illegal to addrspacecast to the generic AS, so use the function pointer AS for null constants. "It is illegal to use Function Pointer as 'Pointer' argument of OpPtrCastToGeneric." This was found when compiling the OpenMP Device RTL for SPIR-V. Signed-off-by: Nick Sarnie <[email protected]>
…llvm#163885) Co-authored-by: Nathan Huckleberry <[email protected]>
This patch adds `asin` to the entry points for Arm and AArch64. Tests have been run using Arm Toolchain for Embedded, a downstream toolchain.
…th optimizations possible (llvm#165464) The patch [Add strictfp attribute to prevent unwanted optimizations of libm calls](https://reviews.llvm.org/D34163) added an `I.isStrictFP()` check to

```
if (!I.isNoBuiltin() && !I.isStrictFP() && !F->hasLocalLinkage() &&
    F->hasName() && LibInfo->getLibFunc(*F, Func) &&
    LibInfo->hasOptimizedCodeGen(Func))
```

It prevents the backend from optimizing even non-math libcalls such as `strlen` and `memcmp` if a call has the strict floating-point attribute. For example, it prevents converting `strlen` and `memcmp` to the millicode calls `__strlen` and `__memcmp`.
… Implement matching and inference and create clusters" (llvm#167559) Reverts llvm#165868 due to buildbot failures Co-authored-by: spupyrev <[email protected]>
The existing function is LT but most of the uses are better expressed as GE
Copy new process from sincospi.
Fixed stub relocation test. Just need to check 32-bit. --------- Co-authored-by: anoopkg6 <[email protected]>
…66883) parameters when defining the scripting interfaces. We try to count the parameters to make sure the user has defined them correctly, but this throws the counting off. I'm not adding a test for this because then it would seem like we thought this was a good idea. I'd actually rather not support it altogether, but we added the parameter checking pretty recently so there are extant implementations that we broke. I only want to support them, not suggest anyone else do this going forward.
…lvm#167534) These selects are dependent on values live into the CHRScope that we cannot infer anything about, so mark the branch weights unknown. These selects usually also just get folded down into icmps, so the profile information ends up being kind of redundant.
…nds (llvm#165295) Reasoning behind the proposed change: this helps us move away from selecting V_ALIGNBIT for fshr with uniform operands.

V_ALIGNBIT is defined in the ISA as:

D0.u32 = 32'U(({ S0.u32, S1.u32 } >> S2.u32[4 : 0]) & 0xffffffffLL)

Note: S0 carries the MSBs and S1 carries the LSBs of the value being aligned. I interpret that as concat(S0, S1) >> S2, with the 0x1F mask applied to the shift amount and only the lower 32 bits returned.

fshr: fshr i32 %src0, i32 %src1, i32 %src2

where concat(%src0, %src1) represents the 64-bit value formed by %src0 as the high 32 bits and %src1 as the low 32 bits, and %src2 is the shift amount. Only the lower 32 bits are returned. So these two are identical.

So V_ALIGNBIT can be expanded through bit manipulation as:

* Concat: S1 | (S0 << 32)
* Shift: ((S1 | (S0 << 32)) >> S2)
* Break the shift: (S1 >> S2) | (S0 << (32 - S2))

The proposed pattern does exactly this. Additionally, src2 in the fshr pattern:

* must be 0–31.
* If the shift is ≥32, hardware semantics differ; you must handle it with extra instructions.

The extra S_ANDs limit the selection to only the lowest 5 bits.
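The equivalence argued above can be checked with a small standalone sketch. This is not the selection pattern from the patch; the function names `fshrReference` and `fshrExpanded` are hypothetical, and the explicit zero-shift branch exists only because shifting a 32-bit value by 32 is undefined in C++.

```cpp
#include <cstdint>

// Reference semantics: concatenate hi:lo into 64 bits, shift right by the
// low 5 bits of the amount, and keep only the lower 32 bits.
constexpr std::uint32_t fshrReference(std::uint32_t hi, std::uint32_t lo,
                                      std::uint32_t amt) {
  const std::uint64_t concat = (static_cast<std::uint64_t>(hi) << 32) | lo;
  return static_cast<std::uint32_t>(concat >> (amt & 31u));
}

// Expanded form using only 32-bit shifts: (lo >> s) | (hi << (32 - s)).
constexpr std::uint32_t fshrExpanded(std::uint32_t hi, std::uint32_t lo,
                                     std::uint32_t amt) {
  const std::uint32_t s = amt & 31u;  // same role as the masking S_AND
  if (s == 0) return lo;              // avoid an undefined 32-bit shift by 32
  return (lo >> s) | (hi << (32u - s));
}

// Compile-time spot check: 0x123456789ABCDEF0 >> 8, low 32 bits.
static_assert(fshrReference(0x12345678u, 0x9ABCDEF0u, 8) == 0x789ABCDEu);
static_assert(fshrExpanded(0x12345678u, 0x9ABCDEF0u, 8) == 0x789ABCDEu);
```

Both functions agree for every shift amount, since the amount is reduced to its low 5 bits before use.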
The call-graph-section-assembly.ll tests in CodeGen/X86 and CodeGen/AArch64 both fail under LLVM_REVERSE_ITERATION. These sets should use SetVector to avoid non-determinism in the output.
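To illustrate why an insertion-ordered set gives deterministic output, here is a minimal standard-library stand-in for the idea behind SetVector. It is not LLVM's `llvm::SetVector`; the class name `InsertionOrderedSet` is hypothetical and exists only for this sketch.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the SetVector idea: a hash set handles
// de-duplication, while a vector records first-insertion order. Iterating
// the vector never depends on hash-table layout or iteration direction.
class InsertionOrderedSet {
 public:
  // Returns true if `value` was newly inserted, false if already present.
  bool insert(const std::string& value) {
    if (!seen_.insert(value).second) return false;
    order_.push_back(value);
    return true;
  }

  // Elements in the order they were first inserted.
  const std::vector<std::string>& items() const { return order_; }

 private:
  std::unordered_set<std::string> seen_;
  std::vector<std::string> order_;
};
```

Emitting output by walking `items()` yields the same sequence on every run, whereas walking the hash set directly would not be guaranteed to.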
…tTrunc (llvm#167165) Fixes llvm#165438 With `simd128` enabled, we may meet vector type truncation in FastISel. To respect llvm#138479, this patch merely bails out on non-integer IR types, though I prefer bailing out for all non-simple types as most targets (X86, AArch64) do.
We want the premerge advisor to write out comments, and we need the issue-write workflow to trigger on it in order for this to work. Landing this before the rest of llvm#166609 to enable testing that given this needs to be in repo due to permissions issues.
Add helper to make it easier to retrieve the single user of a VPUser.
…167221) `asm()` on function declarations is used for specifying the mangling. But that specific spelling is a GNU extension unlike `__asm()`. Found by building with `-std=c2y` in Clang's C frontend's config file.
Adjust the frame setup code for Windows ARM64 to attempt to align pair-wise spills to 16-byte boundaries. This enables us to properly emit the spills for custom clang calling conventions such as preserve most, which spills r9-r15, which are normally nonvolatile registers. Even when using the ARM64EC opcodes for the unwinding, we cannot represent the spill if it is unaligned.
This hangs in expensive_checks
…lvm#166475) Allow widening up to 128-bit registers or if the new register class is at least as large as one of the existing register classes. This was artificially limiting. In particular this was doing the wrong thing with sequences involving copies between VGPRs and AV registers. Nearly all test changes are improvements. The coalescer does not just widen registers out of nowhere. If it's trying to "widen" a register, it's generally packing a register into an existing register tuple, or in a situation where the constraints imply the wider class anyway. 067a110 addressed the allocation failure concern by rejecting coalescing if there are no available registers. The original change in a4e63ea didn't include a realistic testcase to judge if this is harmful for pressure. I would expect any issues from this to be of garden variety subreg handling issue. We could use more dynamic state information here if it really is an issue. I get the best results by removing this override completely. This is a smaller step for patch splitting purposes.
`<stdbool.h>` is provided by the compiler and both Clang and GCC provide C++-aware versions of these headers, making our own wrapper header entirely unnecessary.
Classof for most recipes directly supports VPValue, so there is no need to call getDefiningRecipe when using isa/cast/dyn_cast.
Merge chasing latest versions of bulk test updates
When there are more than 255 sections, the MachO object writer allows the creation of object files that are potentially malformed. Currently, assertions in the object writer code prevent this behavior, but in distributions where assertions are turned off, malformed object files can still be created. This patch turns the assertions into explicit errors.
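The difference between an assertion and an explicit error can be sketched as follows. This is not the object writer's code; `checkSectionCount` and `kMaxMachOSections` are hypothetical names. The 255 limit reflects the fact that a Mach-O symbol's section index (`n_sect`) is an 8-bit field, with 0 meaning "no section".

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Largest section ordinal an 8-bit section index can refer to (255).
constexpr std::size_t kMaxMachOSections =
    std::numeric_limits<std::uint8_t>::max();

// Hypothetical check: returns an error message instead of asserting, so the
// failure is still reported in builds where assertions are compiled out.
std::optional<std::string> checkSectionCount(std::size_t num_sections) {
  if (num_sections > kMaxMachOSections)
    return "too many sections: " + std::to_string(num_sections) +
           " (maximum is " + std::to_string(kMaxMachOSections) + ")";
  return std::nullopt;
}
```

A writer would call such a check before emitting the file and report the returned message through its normal diagnostics path.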
…vm#167025) Ran into a use case where a MachO object file containing a section symbol with no associated section caused a segfault during linking. This patch aims to handle such cases gracefully and keep the linker from crashing. --------- Co-authored-by: Ellis Hoag <[email protected]>
Windows doesn't support `pthread_attr`, which was introduced to asan_test.cpp in llvm#165198, so this change `#ifdef`s out the changes made in that PR. Originally reported by Chrome as https://crbug.com/459880605.
These should always use TargetConstant
Adds test coverage with loops where the same loads get executed under complementary predicates and can be hoisted, together with a set of negative test cases.
(llvm#159884) This eliminates the pseudo registerclasses used to hack the wave register class, which are now replaced with RegClassByHwMode, so most of the diff is from register class ID renumbering.
Collaborator
dpalermo
approved these changes
Nov 12, 2025
No description provided.