merge main into amd-staging #812

z1-cciauto · 2025-12-10T23:55:42Z

No description provided.

These fallback layouts are essentially guesses, used when there is no other way to query register information from the debug server. There is therefore a risk that LLDB and the debug server disagree, which can produce strange effects. I have added a log message here so we have a clue when triaging these problems. Note that it's not wrong to assume a layout in some situations; it's how some debug servers were built. However, if you end up using the fallback when the server expected you to use XML, you're likely going to have a bad time.

To start using the more recently built containers.

…0559) Add support for specifying the names of address spaces when specifying pointer properties for an address space. Update LLVM's AsmPrinter and LLParser to print and read these symbolic address space names.

This adds the missing `Op` suffix to the `CIR_AtomicFence` definition and also improves the `makeAtomicFenceValue` API.

Add parsing and semantic checks for DIMS modifier on NUM_TEAMS, NUM_THREADS, and THREAD_LIMIT.

Add core abstractions for identifying program entities across compilation and link unit boundaries in the Scalable Static Analysis Framework (SSAF). Introduces three key components:
- BuildNamespace: Represents build artifacts (compilation units, link units)
- EntityName: Globally unique entity identifiers across compilation boundaries
- AST mapping: Functions to map Clang AST declarations to EntityNames

Entity identification uses Unified Symbol Resolution (USR) as the underlying mechanism, with extensions for sub-entities (parameters, return values) via suffixes. The abstraction allows whole-program analysis by providing stable identifiers that persist across separately compiled translation units.

Patch disables memory intrinsics expansion, enabled by default in llvm#168622. This patch does the same in clang, but not in flang. The expansion causes massive perf regressions, up to 2x in Fortran code. Reviewers: jeanPerier, vzakhari Reviewed By: vzakhari Pull Request: llvm#171650

…ysis (llvm#168903) This patch introduces a new TargetTransformInfo hook `getInstructionUniformity()` that provides a unified interface for querying target-specific uniformity information about instructions and values. The new hook returns an `InstructionUniformity` enum with three values:
- Default: Result is uniform if all operands are uniform (standard propagation)
- AlwaysUniform: Result is always uniform regardless of operands
- NeverUniform: Result can never be assumed uniform

This API wraps the existing `isAlwaysUniform()` and `isSourceOfDivergence()` hooks, providing a single entry point for uniformity queries. Both LLVM IR-level (via TTI) and MIR-level (via TargetInstrInfo) uniformity analysis have been updated to use the new hook. Target implementations:
- AMDGPU: Wraps existing `isAlwaysUniform()` and `isSourceOfDivergence()` hooks
- NVPTX: Wraps existing `isSourceOfDivergence()` hook

This is an NFC change: all implementations return conservative defaults or wrap existing functionality. Ref patch: llvm#137639
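The three-valued result described above can be illustrated with a small standalone sketch. It does not use the LLVM headers: the enum below only mirrors the names from the commit message, and `isResultUniform` and `classify` are hypothetical helpers. The order in which `classify` consults the two boolean hooks is an assumption made for illustration.

```cpp
#include <vector>

// Standalone mirror of the three values named in the commit message.
// This is an illustration, not the LLVM declaration.
enum class InstructionUniformity { Default, AlwaysUniform, NeverUniform };

// Hypothetical helper: decide whether a result is uniform, given the
// target's classification and the uniformity of each operand.
bool isResultUniform(InstructionUniformity kind,
                     const std::vector<bool> &operandIsUniform) {
  switch (kind) {
  case InstructionUniformity::AlwaysUniform:
    return true; // uniform regardless of operands
  case InstructionUniformity::NeverUniform:
    return false; // can never be assumed uniform
  case InstructionUniformity::Default:
    // Standard propagation: uniform only if every operand is uniform.
    for (bool uniform : operandIsUniform)
      if (!uniform)
        return false;
    return true;
  }
  return false;
}

// Hypothetical wrapper showing how two boolean hooks could be folded
// into one enum. Checking "always uniform" first is an assumption.
InstructionUniformity classify(bool alwaysUniform, bool sourceOfDivergence) {
  if (alwaysUniform)
    return InstructionUniformity::AlwaysUniform;
  if (sourceOfDivergence)
    return InstructionUniformity::NeverUniform;
  return InstructionUniformity::Default;
}
```

With this shape, a caller needs a single query per instruction instead of two separate boolean checks.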

Bumps the runner version to keep things up to date (and prevent us from falling below the support horizon).

…Graph.cpp (NFC)

AppendError ends up trimming this "\n" from the end of the string, then putting another one on. So there's no reason to keep appending the newline in CommandObjectMultiword::Execute.

…MT distribution with partial offsets. (llvm#171512) `vector.extract_strided_slice` can have two forms when specifying offsets.

Case 1:
```
%1 = vector.extract_strided_slice %0 { offsets = [8, 0], sizes = [8, 16], strides = [1, 1]} : vector<24x16xf32> to vector<8x16xf32>
```
Case 2:
```
%1 = vector.extract_strided_slice %0 { offsets = [8], sizes = [8], strides = [1]} : vector<24x16xf32> to vector<8x16xf32>
```
These two ops mean the same thing, but case 2 is syntactic sugar to avoid specifying offsets for fully extracted dims. Currently case 2 fails in XeGPU SIMT distribution. This PR fixes this issue.
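The equivalence of the two forms can be sketched as a normalization step that pads a partial specification out to the full rank. This is a standalone illustration, not the XeGPU implementation: `SliceParams` and `expandPartialSlice` are hypothetical names, and the padding rule (offset 0, full dimension size, stride 1 for each trailing dimension) is read off the two cases above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the three attribute arrays of the op.
struct SliceParams {
  std::vector<int64_t> offsets, sizes, strides;
};

// Hypothetical helper: expand a partial slice specification (case 2)
// into the fully specified form (case 2 becomes case 1). Trailing
// dimensions that were left out are extracted in full.
SliceParams expandPartialSlice(const std::vector<int64_t> &shape,
                               SliceParams params) {
  for (std::size_t dim = params.offsets.size(); dim < shape.size(); ++dim) {
    params.offsets.push_back(0);         // start at the beginning
    params.sizes.push_back(shape[dim]);  // take the whole dimension
    params.strides.push_back(1);         // unit stride
  }
  return params;
}
```

Applied to a `24x16` source with `offsets = [8]`, `sizes = [8]`, `strides = [1]`, this yields the case 1 attributes.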

If a veneer is not disassembled in lite mode, the veneer elimination pass will not recognize it as such and the call to such veneer will remain unchanged. Later, we may need to insert a new veneer for such code ending up with a double veneer. To avoid such suboptimal code generation, always disassemble veneers and guarantee that they are converted to direct calls in BOLT.

This seems to be the standard way to get the path to such tools within LLVM. Calling findBuiltClang() has some annoying behavior, like falling back to CC when it cannot find anything else, which might point to anything or not even be set. We noticed this with our internal build system, as the lli binary is not in the same path as the clang binary.

…n popen.cpp (llvm#171622) This test has begun failing on iossim with 'sh: sort: command not found' in the stderr. I believe this may be due to the change to the lit internal shell not having 'sort' in its path. This patch adds the full path /usr/bin/sort to work around this.

…n NVVMDialect.cpp (NFC)

…m#171579) Assign output sections for injected functions explicitly, and don't reassign in AssignSections pass. This change is a prerequisite for further PRs where veneer functions are created as injected functions and their code section depends on their placement.

…#171639) Besides simplifying the code, this refactor should also make it more efficient: instead of using the `hasAnySubstatement` matcher to find blocks we're interested in, which requires looking through every substatement, this PR introduces a custom `hasFinalStmt` matcher which only checks the last substatement.

This PR adds support for treating WTF::move like std::move in various WebKit checkers.

Add support for the ExpressionTraitExpr

llvm#171647) …pt (llvm#169559)" This reverts commit 4da31b6.

This obsoletes the FIXME in llvm#85686, but it doesn't address the issue where moves from CCR will still be emitted on 68000. However, all such moves will now be emitted as physreg copies, and the issue can thus be handled there in a followup change.

/usr/bin/ld: tools/clang/unittests/Analysis/Scalable/CMakeFiles/ClangScalableAnalysisFrameworkTests.dir/ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN5clang7ASTUnitD1Ev'

…lvm#167754) This adjusts the behavior of running dap_server.py directly to better support the current state of development. A few parts of the 'main' body were stale and not functional. These improvements include:
* Instead of the custom tracefile / replay file parsing logic, I adjusted the replay helper to handle parsing lldb-dap log files created with the `LLDBDAP_LOG` env variable, allowing you to more easily run a failing test like: `python3 lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py --adapter lldb-dap -r lldb-test-build.noindex/tools/lldb-dap/console/TestDAP_console.test_custom_escape_prefix/dap.txt`
* Migrated argument parsing to `argparse`, which is available in all versions of Python 3 and has a few improvements over `optparse`.
* Corrected the existing arguments and updated `run_vscode` > `run_adapter`. You can use this for simple debugging like: `xcrun python3 lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py --adapter=lldb-dap --adapter-arg='--pre-init-command' --adapter-arg 'help' --program a.out --init-command 'help'`

Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of the profile runtime which require a filesystem or malloc. This minimal library only requires string.h from the C library. This is useful for profiling or code coverage of baremetal images, which don't have filesystem APIs, and might not have malloc configured (or have limited heap space). Expected usage:
- Add code to your project to call `__llvm_profile_get_size_for_buffer()` and `__llvm_profile_write_buffer()` to write the profile data to a buffer in memory, and then copy that data off the device using target-specific tools.
- If you're using a linker script, set up your linker script to map the profiling and coverage input sections to corresponding output sections with the same name, and mark them KEEP. `__llvm_covfun` and `__llvm_covmap` are non-allocatable, `__llvm_prf_names` is read-only allocatable, and `__llvm_prf_cnts` and `__llvm_prf_data` are read-write allocatable.
- The resulting data is in the same format as the non-baremetal profiles.

There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.

Use the same twiden format for PseudoSF_VSETTM and PseudoSF_VSETTK as other XSfmm pseudos. Though I don't think we use the operand from these instructions.

… peelToTurnInvariantLoadsDereferenceable. (llvm#171547) llvm.assume intrinsics have the mayWriteToMemory property, but won't prevent the load from becoming dereferenceable.

Add documentation for variadic `isa<>` in the LLVM Programmer's Manual.

…ge (llvm#171705) As it is done in `flang-rt/lib/runtime/edit-input.cpp`, emit a runtime error message when trying to raise IEEE exception on the device. `MapException` and `feraiseexcept` are used in the lowering of the nearest intrinsic even on the device.

…utable in popen.cpp" (llvm#171706) Reverts llvm#171622 Co-authored-by: Andrew Haberlandt <[email protected]>

Previously we would hit an assertion failure when a relocation represented by a PAuth ifunc required a GOT and the addend of a relocation did not fit into the immediate operand of an ADD instruction. Fix it by extracting a function for materializing arbitrary addends and using it to materialize the addend. Reviewers: fmayer, hvdijk Pull Request: llvm#171707

Previously we would assert when a ValueTypeByHwMode was missing a case for the current mode; now we report an error instead. Interestingly, this error only occurs when the DAG patterns use RegClassByHwMode, but not normal RegisterClass instances. Found while I added RegClassByHwMode to RISC-V and was getting an assertion due to `XLenFVT`/`XLenVecI32VT` not having an entry for the default mode. Reviewed By: arsenm Pull Request: llvm#171254

Previously, we were emitting a broken AliasPatternCond array, outputting `MyTarget::RegClassByHwModeRegClassID` which does not exist. Instead, we now add a new predicate and pass the RegClassByHwMode index as the value argument. Pull Request: llvm#171264

Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170947

printFlags takes care of inserting the correct amount of spaces, depending on whether there are flags to print or not.

This pass implements the OpenACC loop tiling transformation for acc.loop operations that have the tile clause (OpenACC 3.4 spec, section 2.9.8). The tile clause specifies that the iterations of the associated loops should be divided into tiles (rectangular blocks). The pass transforms a single or nested acc.loop with tile clauses into a structure of "tile loops" (iterating over tiles) containing "element loops" (iterating within tiles).

For example, tiling a 2-level nested loop with tile(T1, T2):
```
// Before tiling:
acc.loop tile(T1, T2) control(%i, %j) = ...

// After tiling:
acc.loop control(%i) step (s1*T1) {        // tile loop 1
  acc.loop control(%j) step (s2*T2) {      // tile loop 2
    acc.loop control(%ii) = (%i) to (min(ub1, %i+s1*T1)) {
      acc.loop control(%jj) = (%j) to (min(ub2, %j+s2*T2)) {
        // loop body using %ii, %jj
      }
    }
  }
}
```
Key features:
- Handles constant tile sizes and wildcard tile sizes ('*') which use a configurable default tile size
- Properly handles collapsed loops with tile counts exceeding collapse count by uncollapsing loops before tiling
- Distributes gang/worker/vector attributes appropriately: gang -> tile loops, vector -> element loops
- Validates that tile size types are not wider than loop IV types
- Emits optimization remarks for tiling decisions

Three test files are added:
- acc-loop-tiling.mlir: Tests single and nested loop tiling with constant tile sizes, unknown tile sizes (*), and loops with collapse attributes
- acc-loop-tiling-invalid.mlir: Tests error diagnostic when tile size type is wider than the loop IV type
- acc-loop-tiling-remarks.mlir: Tests optimization remarks emitted for tiling decisions including default tile size selection

Co-authored-by: Vijay Kandiah <[email protected]>
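The tile-loop / element-loop structure can be mirrored in plain C++ to see which iterations it visits and in what order. This is only a sketch of the loop shape from the example above, not the pass itself: `tiledOrder` is a hypothetical function, and it assumes exclusive upper bounds and positive steps, which the commit message does not state.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical illustration: return the (ii, jj) pairs of a 2-level loop
// nest in the order the tiled structure visits them.
// Assumes exclusive upper bounds (ub1, ub2) and positive steps (s1, s2).
std::vector<std::pair<int, int>> tiledOrder(int lb1, int ub1, int s1, int T1,
                                            int lb2, int ub2, int s2, int T2) {
  std::vector<std::pair<int, int>> visited;
  for (int i = lb1; i < ub1; i += s1 * T1)      // tile loop 1
    for (int j = lb2; j < ub2; j += s2 * T2)    // tile loop 2
      // Element loops: clamp to the original bound so that partial
      // tiles at the edge do not run past the iteration space.
      for (int ii = i; ii < std::min(ub1, i + s1 * T1); ii += s1)
        for (int jj = j; jj < std::min(ub2, j + s2 * T2); jj += s2)
          visited.emplace_back(ii, jj);
  return visited;
}
```

The `min` clamp is what keeps the total iteration count unchanged when the trip count is not a multiple of the tile size.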

Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170950

llvm#169131 Should fix: ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN4llvm3omp27isAllowedClauseForDirectiveENS0_9DirectiveENS0_6ClauseEj' https://lab.llvm.org/buildbot/#/builders/10/builds/18851

That hopefully concludes the initial upstreaming. Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170951


…llvm#169779) SelectionDAG uses the DAGCombiner to fold a load followed by a sext to a load and sext instruction. For example, in x86 we will see that
```
%1 = load i32, ptr @GloBaRR
#dbg_value(i32 %1, !43, !DIExpression(), !52)
%2 = sext i32 %1 to i64, !dbg !53
```
is converted to:
```
%0:gr64_nosp = MOVSX64rm32 $rip, 1, $noreg, @GloBaRR, $noreg, debug-instr-number 1, debug-location !51
DBG_VALUE $noreg, $noreg, !"Idx", !DIExpression(), debug-location !52
```
The `DBG_VALUE` needs to be transferred correctly to the new combined instruction, and it needs to be appended with a `DIExpression` which contains a `DW_OP_LLVM_fragment`, describing that the lower bits of the virtual register contain the value. This patch fixes the above described problem.

…1719)

This patch fixes issues introduced by llvm#171491 when running tests in CI. The shell tests expect certain characters when matching diagnostics. With llvm#171491, those characters can either be Unicode specific characters or their ASCII equivalent. The tests were always expecting the ASCII version. This patch fixes this by using a regex to match one or the other.

If the 'counted_by' value is signed, we will incorrectly allow accesses when the value is negative. This has obvious bad effects as it will allow accessing a huge swath of unallocated memory. Also clarify and rearrange the parameters to make them more perspicuous. Fixes: llvm#170987.
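The class of bug described above can be shown with a small standalone sketch. It is not the Clang code that was changed: both functions are hypothetical, and the idea that the faulty check compares the count as an unsigned value is an assumption used only to make the failure mode concrete.

```cpp
#include <cstdint>

// Hypothetical faulty check (assumed shape): the signed count is
// reinterpreted as unsigned, so a negative count becomes a huge bound.
bool indexInBoundsBuggy(int64_t index, int64_t count) {
  return static_cast<uint64_t>(index) < static_cast<uint64_t>(count);
}

// Hypothetical corrected check: a negative count means no element is
// accessible, so every index is rejected.
bool indexInBounds(int64_t index, int64_t count) {
  if (count < 0)
    return false;
  return index >= 0 && index < count;
}
```

With a count of -1, the first variant accepts index 5 while the second rejects it, which is the difference between allowing and refusing an access into unallocated memory.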

z1-cciauto · 2025-12-10T23:56:50Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3221

z1-cciauto · 2025-12-11T01:19:58Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3222

llvm#171745) … instrs. (llvm#169779)" This reverts commit 2b958b9. I might have broken the sanitizer-x86_64-linux bot /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp clang++: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248: const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t) const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length && "Invalid index!"' failed.

z1-cciauto · 2025-12-11T03:23:39Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/3223

vangthao95 and others added 30 commits December 10, 2025 08:58

[AMDGPU][GlobalISel] Add RegBankLegalize support for G_FPEXT (llvm#17…

d162afa

…1483)

[X86] shift-i512.ll - add load test coverage (llvm#171642)

16fbbac

[libc++][Github] Bump Runners to Next Group (llvm#168122)

05ef57f

To start using the more recently built containers.

[LLVM][IR] Add support for address space names in DataLayout (llvm#17…

0ab6a63

…0559) Add support for specifying the names of address spaces when specifying pointer properties for an address space. Update LLVM's AsmPrinter and LLParser to print and read these symbolic address space names.

[CIR][NFC] Rename AtomicFence to AtomicFenceOp (llvm#171248)

6cacbdb

This adds the missing `Op` suffix to the `CIR_AtomicFence` definition and also improves the `makeAtomicFenceValue` API.

[flang][OpenMP] Frontend support for DIMS modifier (llvm#171454)

1267488

Add parsing and semantic checks for DIMS modifier on NUM_TEAMS, NUM_THREADS, and THREAD_LIMIT.

[libc++][Github] Bump Runner Version to v2.330.0 (llvm#168753)

36a95a5

Bumps the runner version to keep things up to date (and prevent us from falling below the support horizon).

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in ViewOp…

f5c28bd

…Graph.cpp (NFC)

[lldb] Stop emitting pointless newline (NFC) (llvm#171531)

097ac33

AppendError ends up trimming this "\n" from the end of the string, then putting another one on. So there's no reason to keep appending the newline in CommandObjectMultiword::Execute.

[MLIR] Apply clang-tidy fixes for readability-simplify-boolean-expr i…

c1fd5ac

…n NVVMDialect.cpp (NFC)

Add the support for recognizing WTF::move like std::move (llvm#170820)

aca48a4

This PR adds support for treating WTF::move like std::move in various WebKit checkers.

[Flang][Pass]Remove dependence on CodeGen pass

4427e34

[CIR] Add support for ExpressionTraitExpr (llvm#171634)

ec4aba3

Add support for the ExpressionTraitExpr

Revert "[Hexagon] Passes for widening vector operations and shuffle o… (

48d942c

llvm#171647) …pt (llvm#169559)" This reverts commit 4da31b6.

[clang][unittest] Fix build break with BUILD_SHARED_LIBS=ON

62e00a0

/usr/bin/ld: tools/clang/unittests/Analysis/Scalable/CMakeFiles/ClangScalableAnalysisFrameworkTests.dir/ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN5clang7ASTUnitD1Ev'

[RISCV] Add OperandType for XSfmm TWiden. (llvm#171572)

c5ac7d6

Use the same twiden format for PseudoSF_VSETTM and PseudoSF_VSETTK as other XSfmm pseudos. Though I don't think we use the operand from these instructions.

[LoopPeel] Ignore assume intrinsics for the mayWriteToMemory check in…

ccc3835

… peelToTurnInvariantLoadsDereferenceable. (llvm#171547) llvm.assume intrinsics have the mayWriteToMemory property, but won't prevent the load from becoming dereferenceable.

jurahul and others added 19 commits December 10, 2025 13:19

[NFC][LLVM] Document variadic isa (llvm#136869)

e61c2d4

Add documentation for variadic `isa<>` in the LLVM Programmer's Manual.

Revert "[sanitizer_common][test-only] Specify full path for sort exec…

16ee5c7

…utable in popen.cpp" (llvm#171706) Reverts llvm#171622 Co-authored-by: Andrew Haberlandt <[email protected]>

[FlowSensitive] [StatusOr] [13/N] Add support for gtest ASSERTs

d34c717

Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170947

[VPlan] Strip stray whitespace when printing VPWidenSelectRecipe. (NFCI)

5a1299b

printFlags takes care of inserting the correct amount of spaces, depending on whether there are flags to print or not.

[AMDGPU] Implement codegen for GFX11+ V_CVT_PK_[IU]16_F32 (llvm#168719)

6ae0b9f

[FlowSensitive] [StatusOr] [14/N] Support nested StatusOrs

ddc638c

Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170950

[clang][ssaf] Speculative fix of unittest build (llvm#171720)

6b75eae

llvm#169131 Should fix: ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN4llvm3omp27isAllowedClauseForDirectiveENS0_9DirectiveENS0_6ClauseEj' https://lab.llvm.org/buildbot/#/builders/10/builds/18851

[FlowSensitive] [StatusOr] [15/15] Support references to Status(Or) ptrs

e603fac

That hopefully concludes the initial upstreaming. Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170951

[NFC] [FlowSensitive] [StatusOr] expose statusType in header (llvm#17…

56fb92a

…1719)

[bazel] Port 575d689 (llvm#171722)

e305cf2

merge main into amd-staging

3e99778

z1-cciauto requested a review from a team December 10, 2025 23:55

merge conflict TTI.getInstructionUniformity

e23f2c9

ronlieb approved these changes Dec 11, 2025


ronlieb added this to the ain milestone Dec 11, 2025

ronlieb requested a review from a team December 11, 2025 03:48

z1-cciauto merged commit 2fbe932 into amd-staging Dec 11, 2025
19 checks passed

z1-cciauto deleted the upstream_merge_202512101855 branch December 11, 2025 06:13
