merge main into amd-staging #575

ronlieb · 2025-11-12T23:46:21Z

No description provided.

In order to have MVE support, the same bits of the CPACR register that enable the floating-point extension must be set.

…tions. (llvm#164682) This change converts vector loads to scalar loads when the loaded vector is only bitcast to a scalar afterwards anyway. alive2 proof: https://alive2.llvm.org/ce/z/U_rvht

Fixes minor spelling issues in the shape dialect's passes tablegen file.

As per the wording from 5.2, the COLLAPSE clause applies once to the entire construct. The 6.0 spec has a somewhat similar wording with the same intent. In practice, apply the clause to the innermost leaf constituent that allows it, without requiring it to be the exact innermost leaf.

Added support for conditional operators in the lifetime safety analysis. Added a `VisitConditionalOperator` method to the `FactsGenerator` class to handle the ternary operator (`?:`) in lifetime safety analysis. Fixes llvm#157108

Upstream supporting Load/Store ops for Complex with volatile qualifier

Add a missing description.

The subtarget may not be set if no functions are present in the module. Attempt to use the TargetMachine directly in more cases. Fixes llvm#165422 Fixes llvm#167577

llvm#165946) When building LLVM and Clang on Windows with plugin support enabled, some symbols are redundantly exported due to template instantiations and lambda functions. These symbols are not needed in the importing translation units and can be safely removed. In the meantime, the global variables and static data members are needed for correct linking and runtime behavior, so they are added to the export list. Also, the `llvm::<Class>::dump()` and `clang::<Class>::dump()` methods are not needed for linking in importing translation units, because they are only available in debug builds and should be only used for debugging purposes. Therefore, these methods are removed from the export list.

…vm#167645)

…m#167648)

… MapsForPrivatizedSymbolsPass (llvm#167554) The descriptors of a variable that has been privatized should be mapped `tofrom` instead of `to`.

…tTriple().isSPIRV() ` with `targetSupportsBF16Type(MF)` (llvm#167704)

…llvm#152738) This PR adds hardware-measured latencies for all instructions defined in Section 16 of the RVV specification: "Vector Permutation Instructions" to the SpacemiT-X60 scheduling model. --------- Signed-off-by: Mikhail R. Gadelha <[email protected]>

…#166933)

…m#166668) Including unistd.h does not expose fileno() on Newlib.

Store the list of errors in the ConstructDecomposition class in addition to the broken-up output. This is not used in flang yet, because the splitting happens at a time when diagnostic messages can no longer be emitted. Use unit tests to test this instead.

Our driver warns under various circumstances if SDK directories can't be found. This warning is not applicable in ThinLTO codegen mode (-fthinlto-index=). Suppress it. The motivation for doing this is that we sometimes see this warning emitted when DTLTO invokes the compiler on a remote machine to do the LTO backend compilations (with -fthinlto-index=). Internal Ref: TOOLCHAIN-20592

Follow up on c2d4c7c ([VPlan] Permit more users in narrowToSingleScalars) to fix an assert related to WidenStore users of the recipe being narrowed in narrowToSingleScalars.

Previously, we had two levels of attributes:
- HLSLUnparsedSemantic
- N attributes, one for each known system semantic.

The first was assigned during parsing and carried no meaning other than "there is a semantic token". It was then converted to one of the N attributes later, during Sema. Those attributes also carried information such as "is indexable" or "is index explicit".

This had a few issues:
- there was no difference between a semantic attribute applied to a decl and the effective semantic in the entrypoint use context.
- having the indexable bit was not useful.
- semantic constraint checks were split between .td files and Sema.

Also, the existing implementation attached effective attributes to the type decl or parameters, meaning that reusing a struct decl across entrypoints or in a nested type was not supported, even though it is legal in HLSL.

This PR tries to simplify semantic attributes by having three attributes:
- HLSLUnparsedSemantic
- HLSLParsedSemantic
- HLSLAppliedSemantic

Initial parsing emits an `HLSLUnparsedSemantic`. We simply say "here is an HLSL semantic token", but we don't do any semantic check.

Then, Sema does initial validation and transforms an UnparsedSemantic into a ParsedSemantic. This validates that a system semantic is known, or that the associated type is valid (like uint3 for a ThreadIndex).

Then, once we parse an actual shader entrypoint, we can know how semantics are used in a real context. This step emits a list of AppliedSemantic. Those are the actual semantics in use for this specific entrypoint. Those attributes are attached to each entrypoint parameter, as a flat list matching the semantic structure flattening HLSL defines. At this stage of Sema, index collision and other stage compatibility checks are carried out.

This allows codegen to simply iterate over this list and emit the proper DXIL or SPIR-V codegen.

…am (llvm#167724) This got exposed by `09262656f32ab3f2e1d82e5342ba37eecac52522`.

The underlying stream of `m_os` is referenced by the `TextDiagnostic` member of `TextDiagnosticPrinter`. It got turned into a `llvm::formatted_raw_ostream` in the commit above. When `~TextDiagnosticPrinter` (and thus `~TextDiagnostic`) is invoked, we now call `~formatted_raw_ostream`, which tries to access the underlying stream. But `m_os` was already deleted because it is earlier in the order of destruction in `TextDiagnosticPrinter`. Move the `m_os` member before the `TextDiagnosticPrinter` to avoid a use-after-free.

Drive-by:
* Also move the `m_output` member which the `m_os` holds a reference to. The fact it's a reference indicates the expectation is most likely that the string outlives the stream.

The ASAN macOS bot is currently failing with this:

```
08:15:39 =================================================================
08:15:39 ==61103==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600012cf40 at pc 0x00012140d304 bp 0x00016eecc850 sp 0x00016eecc848
08:15:39 READ of size 8 at 0x60600012cf40 thread T0
08:15:39     #0 0x00012140d300 in llvm::formatted_raw_ostream::releaseStream() FormattedStream.h:205
08:15:39     #1 0x00012140d3a4 in llvm::formatted_raw_ostream::~formatted_raw_ostream() FormattedStream.h:145
08:15:39     #2 0x00012604abf8 in clang::TextDiagnostic::~TextDiagnostic() TextDiagnostic.cpp:721
08:15:39     #3 0x00012605dc80 in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:30
08:15:39     #4 0x00012605dd5c in clang::TextDiagnosticPrinter::~TextDiagnosticPrinter() TextDiagnosticPrinter.cpp:27
08:15:39     #5 0x0001231fb210 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #6 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #7 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91
08:15:39     #8 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103
08:15:39     #9 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
08:15:39     #10 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
...
08:15:39
08:15:39 0x60600012cf40 is located 32 bytes inside of 56-byte region [0x60600012cf20,0x60600012cf58)
08:15:39 freed by thread T0 here:
08:15:39     #0 0x0001018abb88 in _ZdlPv+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4bb88)
08:15:39     #1 0x0001231fb1c0 in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #2 0x0001231fb3bc in (anonymous namespace)::StoringDiagnosticConsumer::~StoringDiagnosticConsumer() ClangModulesDeclVendor.cpp:47
08:15:39     #3 0x000129aa9d70 in clang::DiagnosticsEngine::~DiagnosticsEngine() Diagnostic.cpp:91
08:15:39     #4 0x0001230436b8 in llvm::RefCountedBase<clang::DiagnosticsEngine>::Release() const IntrusiveRefCntPtr.h:103
08:15:39     #5 0x0001231fe6c8 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
08:15:39     #6 0x0001231fe858 in (anonymous namespace)::ClangModulesDeclVendorImpl::~ClangModulesDeclVendorImpl() ClangModulesDeclVendor.cpp:93
...
08:15:39
08:15:39 previously allocated by thread T0 here:
08:15:39     #0 0x0001018ab760 in _Znwm+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x4b760)
08:15:39     #1 0x0001231f8dec in lldb_private::ClangModulesDeclVendor::Create(lldb_private::Target&) ClangModulesDeclVendor.cpp:732
08:15:39     #2 0x00012320af58 in lldb_private::ClangPersistentVariables::GetClangModulesDeclVendor() ClangPersistentVariables.cpp:124
08:15:39     #3 0x0001232111f0 in lldb_private::ClangUserExpression::PrepareForParsing(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, bool) ClangUserExpression.cpp:536
08:15:39     #4 0x000123213790 in lldb_private::ClangUserExpression::Parse(lldb_private::DiagnosticManager&, lldb_private::ExecutionContext&, lldb_private::ExecutionPolicy, bool, bool) ClangUserExpression.cpp:647
08:15:39     #5 0x00012032b258 in lldb_private::UserExpression::Evaluate(lldb_private::ExecutionContext&, lldb_private::EvaluateExpressionOptions const&, llvm::StringRef, llvm::StringRef, std::__1::shared_ptr<lldb_private::ValueObject>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) UserExpression.cpp:280
08:15:39     #6 0x000120724010 in lldb_private::Target::EvaluateExpression(llvm::StringRef, lldb_private::ExecutionContextScope*, std::__1::shared_ptr<lldb_private::ValueObject>&, lldb_private::EvaluateExpressionOptions const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, lldb_private::ValueObject*) Target.cpp:2905
08:15:39     #7 0x00011fc7bde0 in lldb::SBTarget::EvaluateExpression(char const*, lldb::SBExpressionOptions const&) SBTarget.cpp:2305
08:15:39 ==61103==ABORTING
...
```
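The fix relies on the C++ rule that non-static data members are destroyed in reverse order of declaration. The sketch below illustrates that rule in isolation. `Stream`, `Printer`, `FixedConsumer` and `BrokenConsumer` are hypothetical stand-ins, not the LLDB or Clang types, and liveness is tracked through a global flag so the example itself never touches a destroyed object.

```cpp
#include <cassert>

// Global bookkeeping so the example never reads a destroyed object.
static bool g_stream_alive = false;
static bool g_printer_saw_live_stream = false;

// Hypothetical stand-in for the output stream member.
struct Stream {
  Stream() { g_stream_alive = true; }
  ~Stream() { g_stream_alive = false; }
};

// Hypothetical stand-in for a printer that touches its stream on destruction.
struct Printer {
  Stream *stream = nullptr;
  // A real printer would flush through *stream here; this sketch only
  // records whether doing so would have been safe.
  ~Printer() { g_printer_saw_live_stream = g_stream_alive; }
};

// Stream declared first: it is destroyed last, after the printer.
struct FixedConsumer {
  Stream os;
  Printer printer;
  FixedConsumer() { printer.stream = &os; }
};

// Printer declared first: the stream is destroyed before the printer.
struct BrokenConsumer {
  Printer printer;
  Stream os;
  BrokenConsumer() { printer.stream = &os; }
};

// Constructs and destroys a consumer, then reports what the printer saw.
template <typename Consumer>
bool printerSawLiveStream() {
  { Consumer c; }
  return g_printer_saw_live_stream;
}

inline void demo() {
  assert(printerSawLiveStream<FixedConsumer>());
  assert(!printerSawLiveStream<BrokenConsumer>());
}
```

With the `BrokenConsumer` layout a real printer would dereference a dead stream in its destructor, which is the kind of access the ASAN report above flags.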

…run for any region. (llvm#162025) During init of unclustered schedule stage, minOccupancy may be temporarily increased. But subsequently, if none of the regions are scheduled because they don't meet the conditions of initGCNRegion, minOccupancy remains incorrectly set. This patch avoids this incorrectness by delaying the change of minOccupancy until a region is about to be scheduled.

This adds handling for null base class initialization, but only for the trivial case where the class is empty. This also moves emitCXXConstructExpr to CIRGenExprCXX.cpp for consistency with classic codegen and the incubator repo.

…Dimensions from Affine Maps (llvm#167587) This PR exposes `linalg::inferContractionDims(ArrayRef<AffineMap>)` to Python, allowing users to infer contraction dimensions (batch/m/n/k) directly from a list of affine maps without needing an operation. --------- Signed-off-by: Bangtian Liu <[email protected]>

- A dangling pointer (from std::string) was created and triggered a crash on some Linux distributions under different build types.

No tests modified as there are none that explicitly stop at DynAllocaExpander, and we do not have enough of a pipeline to run those yet anyways.

Reviewers: phoebewang, RKSimon, paperchalice, arsenm
Reviewed By: arsenm
Pull Request: llvm#167740

Test changes are mostly noise. There are a few improvements and a few regressions.

`llvm::TypeSize` uses 64-bit integers, so we should cast `recordSize` to a 64-bit type before multiplying by 8 to prevent an overflow.

…lvm#167682)

Now that the caching seems to be working reasonably well, enable building and testing the entirety of the project to actually catch most of the build configuration issues that this workflow is intended to catch.

After my previous change (llvm#167579), the string exceeded 16380 single-byte characters. MSVC did not like this, so I'm splitting it up into two strings.

Enables the terminal rule for remaining targets

Add the following `FEAT_MOPS_GO` instructions:
* `SETGOP`, `SETGOM`, `SETGOE`
* `SETGOPN`, `SETGOMN`, `SETGOEN`
* `SETGOPT`, `SETGOMT`, `SETGOET`
* `SETGOPTN`, `SETGOMTN`, `SETGOETN`

as blogged about here:
* https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte

and as documented here:
* https://developer.arm.com/documentation/109697/2025_09/Future-Architecture-Technologies

…be directly evaluated into the destination even when it might alias the source (llvm#167344) Evaluate all aggregate compound literals into a temporary and then copy it to the destination if aliasing is possible. This fixes a latent issue exposed by llvm#154490, where evaluating the RHS directly into the destination could ignore potential aliasing. rdar://164094548
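To see why evaluating directly into the destination is unsafe when the initializers read from it, here is a hand-written model of the two strategies. It is only an analogy: the commit concerns Clang's codegen for aggregate compound literals, while `Pair`, `assignInPlace` and `assignViaTemporary` are hypothetical names used to mimic an assignment such as `p = (Pair){p.b, p.a}`.

```cpp
struct Pair {
  int a;
  int b;
};

// Models evaluating the literal straight into the destination: each field is
// written in place, so the second initializer reads an already-overwritten
// field instead of the original value.
constexpr Pair assignInPlace(Pair p) {
  p.a = p.b;
  p.b = p.a;  // reads the new p.a, not the original one
  return p;
}

// Models the safe strategy: build the whole aggregate in a temporary first,
// then copy it to the destination.
constexpr Pair assignViaTemporary(Pair p) {
  Pair tmp{p.b, p.a};
  p = tmp;
  return p;
}

// Starting from {1, 2}, only the temporary-based version swaps the fields.
static_assert(assignInPlace(Pair{1, 2}).b == 2, "in-place evaluation loses the old a");
static_assert(assignViaTemporary(Pair{1, 2}).b == 1, "temporary preserves the old a");
```

The extra copy costs something, which is presumably why the commit only takes the temporary path when aliasing is possible.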

…lvm#167759) Reverts two commits from mainline: the commit that introduced the issue and the tentative fix-forward.

This adds handling in CIR's ScalarExprEmitter for CK_DerivedToBase cast expressions.

This PR adds support for emitting the promise declaration in coroutines and obtaining the `get_return_object()`.

…lvm#166213) This is in preparation for future changes in AMDGPU that will make more substantial use of bundles pre-RA. For now, simply test this with degenerate (single-instruction) bundles.

xfails: needs pranav to look into
flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90
flang/test/Lower/OpenMP/optional-argument-map-2.f90

z1-cciauto · 2025-11-12T23:47:54Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2787

This patch reenables tests that had been xfailed in a previous merge (#575) by Ron.

1. test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90
   Fixed test to accommodate map_clauses related to descriptors that we do not have upstream.
2. test/Lower/OpenMP/optional-argument-map-2.f90
   Same problem as above in addition to a bad merge that enabled some testing that had been deliberately disabled by Andrew on amd-staging in the past. See commit 706196c

vhscampos and others added 30 commits November 12, 2025 15:33

[libc] Add support for MVE to Arm startup code (llvm#167338)

5932477

In order to have MVE support, the same bits of the CPACR register that enable the floating-point extension must be set.
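As a rough illustration of which bits are meant, the sketch below computes the CPACR value with the CP10 and CP11 access fields (bits 23:20) set to full access. This is not the libc startup code from the commit: `kCp10Cp11FullAccess` and `withFpAndMveEnabled` are hypothetical names, and the register address in the comment is the usual M-profile location, stated here as an assumption.

```cpp
#include <cstdint>

// CP10 occupies bits [21:20] and CP11 bits [23:22] of CPACR; 0b11 in each
// field grants full access to the floating-point (and MVE) extension.
constexpr std::uint32_t kCp10Cp11FullAccess = 0xFu << 20;

// Hypothetical helper: returns the CPACR value with FP/MVE access enabled,
// leaving all other bits untouched.
constexpr std::uint32_t withFpAndMveEnabled(std::uint32_t cpacr) {
  return cpacr | kCp10Cp11FullAccess;
}

// On real hardware the startup code would read-modify-write the register,
// for example (assumed address, not taken from the commit):
//   auto *cpacr = reinterpret_cast<volatile std::uint32_t *>(0xE000ED88);
//   *cpacr = withFpAndMveEnabled(*cpacr);
// followed by the barriers the architecture requires.
static_assert(withFpAndMveEnabled(0u) == 0x00F00000u, "bits 23:20 set");
```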

[VectorCombine] Try to scalarize vector loads feeding bitcast instruc…

8280070

…tions. (llvm#164682) This change converts vector loads to scalar loads when the loaded vector is only bitcast to a scalar afterwards anyway. alive2 proof: https://alive2.llvm.org/ce/z/U_rvht

[MLIR][NFC] Fix minor spelling issues (llvm#167606)

d4b43f0

Fixes minor spelling issues in the shape dialect's passes tablegen file.

[CIR] Upstream Load/Store Complex with volatile qualifier (llvm#167216)

f240a73

Upstream supporting Load/Store ops for Complex with volatile qualifier

[mlir][tosa] Add missing ext-mxfp description (llvm#167665)

ec085e5

Add a missing description.

[ARM] Use TargetMachine over Subtarget in ARMAsmPrinter (llvm#166329)

4d1f249

The subtarget may not be set if no functions are present in the module. Attempt to use the TargetMachine directly in more cases. Fixes llvm#165422 Fixes llvm#167577

[Mips] Remove implicit conversions of MCRegister to unsigned. NFC (ll…

0845c5a

…vm#167645)

[X86] Remove implicit conversions of MCRegister to unsigned. NFC (llv…

d4847f7

…m#167648)

[Flang][OpenMP] - Fix the mapping flags used on descriptors mapped by…

33a352f

… MapsForPrivatizedSymbolsPass (llvm#167554) The descriptors of a variable that has been privatized should be mapped `tofrom` instead of `to`.

[NFC][SPIRV][IRTranslator] Replace leftover `MF->getTarget().getTarge…

5b56816

…tTriple().isSPIRV() ` with `targetSupportsBF16Type(MF)` (llvm#167704)

Remove unused standard headers: memory, unordered_* (llvm#167297)

0bbf644

[RISCV] Expand multiplication by 2^N * 3/5/9 + 1 with SHL_ADD (llvm…

ca72e8d

…#166933)

[libc++] Guard fileno() and isatty() usage correctly for Newlib. (llv…

a3058d5

…m#166668) Including unistd.h does not expose fileno() on Newlib.

[PowerPC] Add intrinsic support for xvrlw (llvm#167349)

c0ac0c4

[CodeGen] Use MCRegUnit in two more TRI methods (NFC) (llvm#167680)

905c7aa

[VPlan] Fix assert in store-user in narrowToSingleScalars (llvm#167686)

9ba738a

Follow up on c2d4c7c ([VPlan] Permit more users in narrowToSingleScalars) to fix an assert related to WidenStore users of the recipe being narrowed in narrowToSingleScalars.

[Docs] Fix typo in vp.load.ff intrinsic documentation. NFC (llvm#167721)

830f690

DAG: Fix assert on nofpclass call with aggregate return (llvm#167725)

24be0ba

[X86] Remove Redundant Default Destructor

4d772de

philnik777 and others added 19 commits November 12, 2025 20:38

[libc++] Simplify the implementation of aligned_storage (llvm#162459)

43ca08d

DAG: exp opcodes cannotBeOrderedNegativeFP (llvm#167604)

0385a18

[clang][HLSL] Fix crash issue due to Twine usage

cc54ee8

- A dangling pointer (from std::string) was created and triggered a crash on some Linux distributions under different build types.

DAG: Use poison when widening build_vector (llvm#167631)

782759b

Test changes are mostly noise. There are a few improvements and a few regressions.

[libunwind] Fix build error because of wrong register size (llvm#167743)

e5e9c3b

[CIR] Cast record size to uint64 to prevent overflow (llvm#167525)

a799a8e

`llvm::TypeSize` uses 64-bit integers, so we should cast `recordSize` to a 64-bit type before multiplying by 8 to prevent an overflow.
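A minimal sketch of the overflow being avoided, assuming the record size is held in a 32-bit unsigned integer (the commit message does not state the actual type). `sizeInBitsWrapping` and `sizeInBits` are hypothetical helpers, not CIR functions.

```cpp
#include <cstdint>

// Multiplies in 32 bits and only then widens, so the product wraps first.
constexpr std::uint64_t sizeInBitsWrapping(std::uint32_t recordSizeInBytes) {
  return recordSizeInBytes * 8u;
}

// Widens to 64 bits before multiplying, so the full product is kept.
constexpr std::uint64_t sizeInBits(std::uint32_t recordSizeInBytes) {
  return static_cast<std::uint64_t>(recordSizeInBytes) * 8u;
}

// A 512 MiB record: the bit count needs 33 bits.
static_assert(sizeInBitsWrapping(0x20000000u) == 0u, "32-bit product wraps to zero");
static_assert(sizeInBits(0x20000000u) == 0x100000000ull, "64-bit product is exact");
```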

[AsmPrinter] Replace improper use of Register with MCRegUnit (NFC) (l…

47cef55

…lvm#167682)

[Github] Make bazel workflow run all tests (llvm#167576)

919bff7

Now that the caching seems to be working reasonably well, enable building and testing the entirety of the project to actually catch most of the build configuration issues that this workflow is intended to catch.

[lldb] Split up shared cache objc metadata extractor body (llvm#167761)

6806349

After my previous change (llvm#167579), the string exceeded 16380 single-byte characters. MSVC did not like this, so I'm splitting it up into two strings.
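The general shape of such a split might look like the sketch below: keep each literal under the limit, guard that at compile time, and join the pieces at run time. The names and the tiny placeholder contents are hypothetical; the real extractor body is far longer and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Limit quoted in the commit message for a single literal under MSVC.
constexpr std::size_t kMaxLiteralChars = 16380;

// Hypothetical placeholder pieces standing in for the two halves.
constexpr char kExprPartOne[] = "int first_half(void) { return 1; }\n";
constexpr char kExprPartTwo[] = "int second_half(void) { return 2; }\n";

// Compile-time guards: fail the build if either piece grows past the limit.
static_assert(sizeof(kExprPartOne) - 1 <= kMaxLiteralChars, "part one too long");
static_assert(sizeof(kExprPartTwo) - 1 <= kMaxLiteralChars, "part two too long");

// Joins the two pieces into the full text at run time.
inline std::string buildExpression() {
  std::string text(kExprPartOne);
  text += kExprPartTwo;
  // Sanity check: nothing was lost when joining the halves.
  assert(text.size() == sizeof(kExprPartOne) - 1 + sizeof(kExprPartTwo) - 1);
  return text;
}
```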

CodeGen: Remove target hook for terminal rule (llvm#165962)

dfdada1

Enables the terminal rule for remaining targets

Revert "[HLSL] Rework semantic handling as attributes llvm#166796" (l…

1d2429b

…lvm#167759) Reverts two commits from mainline: the commit that introduced the issue and the tentative fix-forward.

[CIR] Handle scalar DerivedToBase cast expressions (llvm#167370)

260df80

This adds handling in CIR's ScalarExprEmitter for CK_DerivedToBase cast expressions.

[CIR] Emit promise declaration in coroutine (llvm#166683)

cf9cb54

This PR adds support for emitting the promise declaration in coroutines and obtaining the `get_return_object()`.

[flang][OpenMP] Delete include of unused header, NFC (llvm#167762)

66da12a

CodeGen/AMDGPU: Allow 3-address conversion of bundled instructions (l…

6636659

…lvm#166213) This is in preparation for future changes in AMDGPU that will make more substantial use of bundles pre-RA. For now, simply test this with degenerate (single-instruction) bundles.

merge main into amd-staging

f4093c8

xfails: needs pranav to look into
flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90
flang/test/Lower/OpenMP/optional-argument-map-2.f90

ronlieb requested review from a team and dpalermo November 12, 2025 23:46

ronlieb requested review from nicolasvasilache and stellaraccident as code owners November 12, 2025 23:46

dpalermo approved these changes Nov 13, 2025

z1-cciauto merged commit 56c1a58 into amd-staging Nov 13, 2025
16 checks passed

z1-cciauto deleted the amd/merge/upstream_merge_20251112160503 branch November 13, 2025 02:41

bhandarkar-pranav mentioned this pull request Nov 13, 2025

[Flang][NFC] Un-xfail tests that had to be xfailed in f4093c8 #583

Merged
