@z1-cciauto
Collaborator

No description provided.

vangthao95 and others added 30 commits December 10, 2025 08:58
These fallback layouts are essentially guesses, used when there is
no other way to query register information from the debug server.

Therefore there is a risk that LLDB and the debug server disagree,
which can produce strange effects.

I have added a log message here so we have a clue when triaging
these problems.

Note that it's not wrong to assume a layout in some situations.
It's how some debug servers were built. However, if you end up
using the fallback when the server expected you to use XML,
you're likely going to have a bad time.
To start using the more recently built containers.
…0559)

Add support for specifying the names of address spaces when specifying
pointer properties for an address space. Update LLVM's AsmPrinter and
LLParser to print and read these symbolic address space names.
This fixes the missing `Op` suffix in the `CIR_AtomicFence` definition and
also improves the `makeAtomicFenceValue` API.
Add parsing and semantic checks for DIMS modifier on NUM_TEAMS,
NUM_THREADS, and THREAD_LIMIT.
Add core abstractions for identifying program entities across
compilation and link unit boundaries in the Scalable Static Analysis
Framework (SSAF).

Introduces three key components:
- BuildNamespace: Represents build artifacts (compilation units, link
units)
- EntityName: Globally unique entity identifiers across compilation
boundaries
- AST mapping: Functions to map Clang AST declarations to EntityNames

Entity identification uses Unified Symbol Resolution (USR) as the
underlying mechanism, with extensions for sub-entities (parameters,
return values) via suffixes. The abstraction allows whole-program
analysis by providing stable identifiers that persist across separately
compiled translation units.
Patch disables memory intrinsics expansion, enabled by default in
llvm#168622. This patch does the
same in clang, but not in flang.

The expansion causes massive perf regressions, up to 2x in
Fortran code.

Reviewers: jeanPerier, vzakhari

Reviewed By: vzakhari

Pull Request: llvm#171650
…ysis (llvm#168903)

This patch introduces a new TargetTransformInfo hook
`getInstructionUniformity()` that provides a unified interface for
querying target-specific uniformity information about instructions and
values.

The new hook returns an `InstructionUniformity` enum with three values:
- Default: Result is uniform if all operands are uniform (standard
propagation)
- AlwaysUniform: Result is always uniform regardless of operands
- NeverUniform: Result can never be assumed uniform
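
The three values above can be modelled in a few lines. This is only a Python sketch of the propagation rule as described in this message, not the LLVM C++ API; the names `InstructionUniformity` (as a Python enum) and `result_is_uniform` are hypothetical stand-ins used for illustration.

```python
from enum import Enum, auto


class InstructionUniformity(Enum):
    """Hypothetical Python mirror of the three values listed above."""
    DEFAULT = auto()         # uniform if all operands are uniform
    ALWAYS_UNIFORM = auto()  # uniform regardless of operands
    NEVER_UNIFORM = auto()   # can never be assumed uniform


def result_is_uniform(kind, operands_uniform):
    """Return whether a result may be treated as uniform.

    `operands_uniform` is a list of booleans, one per operand.
    This helper is an assumption for illustration, not an LLVM function.
    """
    if kind is InstructionUniformity.ALWAYS_UNIFORM:
        return True
    if kind is InstructionUniformity.NEVER_UNIFORM:
        return False
    # DEFAULT: standard propagation from the operands.
    return all(operands_uniform)
```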

This API wraps the existing `isAlwaysUniform()` and
`isSourceOfDivergence()` hooks, providing a single entry point for
uniformity queries. Both LLVM IR-level (via TTI) and MIR-level (via
TargetInstrInfo) uniformity analysis have been updated to use the new
hook.

Target implementations:
- AMDGPU: Wraps existing `isAlwaysUniform()` and
`isSourceOfDivergence()` hooks
- NVPTX: Wraps existing `isSourceOfDivergence()` hook

This is an NFC change - all implementations return conservative defaults
or wrap existing functionality.

Ref patch: llvm#137639
Bumps the runner version to keep things up to date (and prevent us from
falling below the support horizon).
AppendError ends up trimming this "\n" from the end of the string, then
putting another one on. So there's no reason to keep appending the
newline in CommandObjectMultiword::Execute.
…MT distribution with partial offsets. (llvm#171512)

`vector.extract_strided_slice` can have two forms when specifying
offsets.

Case 1:
```
%1 = vector.extract_strided_slice %0 { offsets = [8, 0], sizes = [8, 16], strides = [1, 1]}
      : vector<24x16xf32> to vector<8x16xf32>
```

Case 2:
```
%1 = vector.extract_strided_slice %0 { offsets = [8], sizes = [8], strides = [1]}
      : vector<24x16xf32> to vector<8x16xf32>
```

These two ops mean the same thing, but case 2 is syntactic sugar to
avoid specifying offsets for fully extracted dims. Currently case 2
fails in XeGPU SIMT distribution. This PR fixes this issue.
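
To illustrate why the two forms are equivalent, here is a small Python sketch that pads the partial form (case 2) out to the full form (case 1). It is not the code from this PR; `normalize_slice` is a hypothetical helper that assumes omitted trailing dims are extracted in full with offset 0 and stride 1.

```python
def normalize_slice(shape, offsets, sizes, strides):
    """Pad partial offsets/sizes/strides to the full rank of `shape`.

    Trailing dims that are not mentioned are taken whole:
    offset 0, size equal to the source dim, stride 1.
    """
    if not (len(offsets) == len(sizes) == len(strides)):
        raise ValueError("offsets, sizes and strides must have equal length")
    if len(offsets) > len(shape):
        raise ValueError("more offsets than source dimensions")
    k = len(offsets)
    missing = len(shape) - k
    full_offsets = list(offsets) + [0] * missing
    full_sizes = list(sizes) + list(shape[k:])
    full_strides = list(strides) + [1] * missing
    return full_offsets, full_sizes, full_strides


# Case 2 from above, expanded for a vector<24x16xf32> source.
print(normalize_slice([24, 16], [8], [8], [1]))
```

Under these assumptions the case 2 attributes expand to exactly the case 1 attributes.
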
If a veneer is not disassembled in lite mode, the veneer elimination
pass will not recognize it as such and the call to such veneer will
remain unchanged.

Later, we may need to insert a new veneer for such code ending up with a
double veneer.

To avoid such suboptimal code generation, always disassemble veneers and
guarantee that they are converted to direct calls in BOLT.
This seems to be the standard way to get the path to such tools within LLVM.
Calling findBuiltClang() has some annoying behavior like falling back to
CC when it cannot find anything else, which might point to anything or
not even be set.

We noticed this with our internal build system as the lli binary is not
in the same path as the clang binary.
…n popen.cpp (llvm#171622)

This test has begun failing on iossim with 'sh: sort: command not found'
in the stderr. I believe this may be due to the change to the lit
internal shell not having 'sort' in its path.

This patch adds the full path /usr/bin/sort to work around this.
…m#171579)

Assign output sections for injected functions explicitly, and don't
reassign in AssignSections pass.

This change is a prerequisite for further PRs where veneer functions are
created as injected functions and their code section depends on their
placement.
…#171639)

Besides simplifying the code, this refactor should also make it more
efficient: instead of using the `hasAnySubstatement` matcher to find
blocks we're interested in, which requires looking through every
substatement, this PR introduces a custom `hasFinalStmt` matcher which
only checks the last substatement.
This PR adds support for treating WTF::move like std::move in various
WebKit checkers.
Add support for the ExpressionTraitExpr
This obsoletes the FIXME in llvm#85686, but it doesn't address the issue
where moves from CCR will still be emitted on 68000. However, all such
moves will now be emitted as physreg copies, and the issue can thus be
handled there in a followup change.
/usr/bin/ld: tools/clang/unittests/Analysis/Scalable/CMakeFiles/ClangScalableAnalysisFrameworkTests.dir/ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN5clang7ASTUnitD1Ev
…lvm#167754)

This adjusts the behavior of running dap_server.py directly to better
support the current state of development. A few parts of the 'main' body
were stale and not functional.

These improvements include:

* Instead of the custom tracefile / replay file parsing logic, I
adjusted the replay helper to handle parsing lldb-dap log files created
with the `LLDBDAP_LOG` env variable, allowing you to more easily run a
failing test like: `python3
lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
--adapter lldb-dap -r
lldb-test-build.noindex/tools/lldb-dap/console/TestDAP_console.test_custom_escape_prefix/dap.txt`
* Migrated argument parsing to `argparse`, which is available in all
versions of Python 3 and has a few improvements over `optparse`.
* Corrected the existing arguments and renamed `run_vscode` to
`run_adapter`. You can use this for simple debugging like: `xcrun
python3 lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
--adapter=lldb-dap --adapter-arg='--pre-init-command' --adapter-arg
'help' --program a.out --init-command 'help'`
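
As a rough sketch of what the `argparse` migration looks like for the flags shown in the commands above: this is not the actual parser in `dap_server.py`. The use of `action="append"` for repeated flags, the help strings, and the `--replay` long name for `-r` are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags quoted above."""
    parser = argparse.ArgumentParser(description="Run or replay a DAP session")
    parser.add_argument("--adapter", help="path to the adapter binary")
    # Repeated flags accumulate into a list (assumed behaviour).
    parser.add_argument("--adapter-arg", action="append", default=[],
                        help="extra argument passed to the adapter")
    parser.add_argument("--program", help="program to debug")
    parser.add_argument("--init-command", action="append", default=[],
                        help="command to run during initialization")
    # The long option name here is hypothetical; only `-r` appears above.
    parser.add_argument("-r", "--replay", help="log file to replay")
    return parser


# Same argument vector as the debugging command quoted above.
args = build_parser().parse_args([
    "--adapter=lldb-dap",
    "--adapter-arg=--pre-init-command",
    "--adapter-arg", "help",
    "--program", "a.out",
    "--init-command", "help",
])
print(args.adapter, args.adapter_arg, args.program, args.init_command)
```

Note that a value starting with `--` has to be attached with `=`, as in the original command, so that `argparse` does not read it as a separate option.
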
Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of
the profile runtime which require a filesystem or malloc. This minimal
library only requires string.h from the C library.

This is useful for profiling or code coverage of baremetal images, which
don't have filesystem APIs, and might not have malloc configured (or
have limited heap space).

Expected usage:

- Add code to your project to call
`__llvm_profile_get_size_for_buffer()` and
`__llvm_profile_write_buffer()` to write the profile data to a buffer in
memory, and then copy that data off the device using target-specific
tools.
- If you're using a linker script, set up your linker script to map the
profiling and coverage input sections to corresponding output sections
with the same name, and mark them KEEP. `__llvm_covfun` and
`__llvm_covmap` are non-allocatable, `__llvm_prf_names` is read-only
allocatable, and `__llvm_prf_cnts` and `__llvm_prf_data` are read-write
allocatable.
- The resulting data is in the same format as the non-baremetal profiles.

There's some room for improvement here in the future for doing profiling
and code coverage for baremetal. If we revised the profiling format, and
introduced some additional host tooling, we could move some of the
metadata into non-allocated sections, and construct the profraw file on
the host. But this patch is sufficient for some use-cases.
Use the same twiden format for PseudoSF_VSETTM and PseudoSF_VSETTK
as other XSfmm pseudos. Though I don't think we use the operand from
these instructions.
… peelToTurnInvariantLoadsDereferenceable. (llvm#171547)

llvm.assume intrinsics have the mayWriteToMemory property, but
won't prevent the load from becoming dereferenceable.
jurahul and others added 19 commits December 10, 2025 13:19
Add documentation for variadic `isa<>` in the LLVM Programmer's Manual.
…ge (llvm#171705)

As it is done in `flang-rt/lib/runtime/edit-input.cpp`, emit a runtime
error message when trying to raise IEEE exception on the device.
`MapException` and `feraiseexcept` are used in the lowering of the
nearest intrinsic even on the device.
Previously we would hit an assertion failure when a relocation represented
by a PAuth ifunc required a GOT and the addend of a relocation did not fit
into the immediate operand of an ADD instruction. Fix it by extracting a
function for materializing arbitrary addends and using it to materialize
the addend.

Reviewers: fmayer, hvdijk

Pull Request: llvm#171707
Previously we would assert when a ValueTypeByHwMode was missing a case
for the current mode, now we report an error instead. Interestingly this
error only occurs when the DAG patterns use RegClassByHwMode, but not
normal RegisterClass instances. Found while I added RegClassByHwMode
to RISC-V and was getting an assertion due to `XLenFVT`/`XLenVecI32VT`
not having an entry for the default mode.

Reviewed By: arsenm

Pull Request: llvm#171254
Previously, we were emitting a broken AliasPatternCond array, outputting
`MyTarget::RegClassByHwModeRegClassID` which does not exist.
Instead, we now add a new predicate and pass the RegClassByHwMode index
as the value argument.

Pull Request: llvm#171264
Reviewers: jvoung

Reviewed By: jvoung

Pull Request: llvm#170947
printFlags takes care of inserting the correct amount of spaces,
depending on whether there are flags to print or not.
This pass implements the OpenACC loop tiling transformation for acc.loop
operations that have the tile clause (OpenACC 3.4 spec, section 2.9.8).

The tile clause specifies that the iterations of the associated loops
should be divided into tiles (rectangular blocks). The pass transforms a
single or nested acc.loop with tile clauses into a structure of "tile
loops" (iterating over tiles) containing "element loops" (iterating
within tiles).

For example, tiling a 2-level nested loop with tile(T1, T2):
```
  // Before tiling:
  acc.loop tile(T1, T2) control(%i, %j) = ...

  // After tiling:
  acc.loop control(%i) step (s1*T1) {        // tile loop 1
    acc.loop control(%j) step (s2*T2) {      // tile loop 2
      acc.loop control(%ii) = (%i) to (min(ub1, %i+s1*T1)) {
        acc.loop control(%jj) = (%j) to (min(ub2, %j+s2*T2)) {
          // loop body using %ii, %jj
        }
      }
    }
  }
```
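
The shape of the tiled nest can be checked with a small Python model. This is only a sketch of the loop structure shown above, not the pass itself; it assumes positive steps and exclusive upper bounds, and `original_iterations` / `tiled_iterations` are hypothetical names.

```python
def original_iterations(lb1, ub1, s1, lb2, ub2, s2):
    """Iteration space of the untiled 2-level loop (exclusive upper bounds)."""
    return [(i, j) for i in range(lb1, ub1, s1) for j in range(lb2, ub2, s2)]


def tiled_iterations(lb1, ub1, s1, lb2, ub2, s2, t1, t2):
    """Iteration space after tiling with tile(t1, t2), as in the example."""
    visited = []
    for i in range(lb1, ub1, s1 * t1):                    # tile loop 1
        for j in range(lb2, ub2, s2 * t2):                # tile loop 2
            for ii in range(i, min(ub1, i + s1 * t1), s1):      # element loop 1
                for jj in range(j, min(ub2, j + s2 * t2), s2):  # element loop 2
                    visited.append((ii, jj))
    return visited


before = original_iterations(0, 10, 1, 0, 7, 2)
after = tiled_iterations(0, 10, 1, 0, 7, 2, t1=4, t2=2)
# Tiling reorders the iterations but visits each one exactly once.
print(sorted(before) == sorted(after), before == after)
```

The `min(ub, ...)` clamp is what handles trip counts that are not a multiple of the tile size.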

Key features:
- Handles constant tile sizes and wildcard tile sizes ('*') which use a
configurable default tile size
- Properly handles collapsed loops with tile counts exceeding collapse
count by uncollapsing loops before tiling
- Distributes gang/worker/vector attributes appropriately: gang -> tile
loops, vector -> element loops
- Validates that tile size types are not wider than loop IV types
- Emits optimization remarks for tiling decisions

Three test files are added:
- acc-loop-tiling.mlir: Tests single and nested loop tiling with
constant tile sizes, unknown tile sizes (*), and loops with collapse
attributes
- acc-loop-tiling-invalid.mlir: Tests error diagnostic when tile size
type is wider than the loop IV type
- acc-loop-tiling-remarks.mlir: Tests optimization remarks emitted for
tiling decisions including default tile size selection

Co-authored-by: Vijay Kandiah <[email protected]>
Reviewers: jvoung

Reviewed By: jvoung

Pull Request: llvm#170950
llvm#169131

Should fix:
ASTEntityMappingTest.cpp.o: undefined reference to symbol
'_ZN4llvm3omp27isAllowedClauseForDirectiveENS0_9DirectiveENS0_6ClauseEj'

https://lab.llvm.org/buildbot/#/builders/10/builds/18851
That hopefully concludes the initial upstreaming.

Reviewers: jvoung

Reviewed By: jvoung

Pull Request: llvm#170951
…llvm#169779)

SelectionDAG uses the DAGCombiner to fold a load followed by a sext into a
single sign-extending load instruction. For example, on x86 we will see that

```
%1 = load i32, ptr @GloBaRR
  #dbg_value(i32 %1, !43, !DIExpression(), !52)
%2 = sext i32 %1 to i64, !dbg !53
```

is converted to:

```
%0:gr64_nosp = MOVSX64rm32 $rip, 1, $noreg, @GloBaRR, $noreg, debug-instr-number 1, debug-location !51
DBG_VALUE $noreg, $noreg, !"Idx", !DIExpression(), debug-location !52
```

The `DBG_VALUE` needs to be transferred correctly to the new combined
instruction, and it needs to be appended with a `DIExpression` which
contains a `DW_OP_LLVM_fragment`, describing that the lower bits of the
virtual register contain the value.

This patch fixes the above described problem.
This patch fixes issues introduced by
llvm#171491 when running tests in
CI.

The shell tests expect certain characters when matching diagnostics.
With llvm#171491, those characters
can either be Unicode specific characters or their ASCII equivalent. The
tests were always expecting the ASCII version. This patch fixes this by
using a regex to match one or the other.
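
A minimal Python sketch of the "match one or the other" idea follows. The commit message does not say which characters are involved, so the Unicode/ASCII pair below (a box-drawing corner versus a backtick and dash) and the `diagnostic_message` helper are purely illustrative assumptions, not the patterns used in the actual tests.

```python
import re

# Hypothetical marker pair: "\u2570\u2500" (Unicode) or "`-" (ASCII).
DIAG_LINE = re.compile("(?:\u2570\u2500|`-) error: (?P<msg>.+)")


def diagnostic_message(line):
    """Return the diagnostic text if `line` uses either marker style."""
    match = DIAG_LINE.search(line)
    return match.group("msg") if match else None


print(diagnostic_message("    \u2570\u2500 error: use of undeclared identifier"))
print(diagnostic_message("    `- error: use of undeclared identifier"))
```
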
If the 'counted_by' value is signed, we will incorrectly allow accesses
when the value is negative. This has obvious bad effects as it will
allow accessing a huge swath of unallocated memory.
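
One way such a bug can arise is sketched below in Python. The commit message does not describe the mechanism, so the model here (the signed count being reinterpreted as an unsigned 64-bit value before the comparison) is an assumption, and both function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def unchecked_in_bounds(index, count):
    """Buggy model: compare after reinterpreting both values as unsigned."""
    return (index & MASK64) < (count & MASK64)


def checked_in_bounds(index, count):
    """Fixed model: a negative count admits no valid index at all."""
    return count >= 0 and 0 <= index < count


# With count == -1 the unsigned view is 2**64 - 1, so almost any index passes.
print(unchecked_in_bounds(1000, -1), checked_in_bounds(1000, -1))
```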

Also clarify and rearrange the parameters to make them more
perspicuous.

Fixes: llvm#170987.
@z1-cciauto z1-cciauto requested a review from a team December 10, 2025 23:55
@z1-cciauto
Collaborator Author

llvm#171745)

… instrs. (llvm#169779)"

This reverts commit 2b958b9.

I might have broken the sanitizer-x86_64-linux bot


/home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp
clang++:
/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248:
const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t)
const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length &&
"Invalid index!"' failed.
@ronlieb ronlieb added this to the ain milestone Dec 11, 2025
@ronlieb ronlieb requested a review from a team December 11, 2025 03:48
@z1-cciauto z1-cciauto merged commit 2fbe932 into amd-staging Dec 11, 2025
19 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202512101855 branch December 11, 2025 06:13