@ronlieb ronlieb commented Dec 3, 2025

No description provided.

RKSimon and others added 30 commits December 2, 2025 16:08
…vm#169917)

New pass manager does not use TargetPassConfig.
GlobalISel requires TargetPassConfig to reportGISelFailure,
and its only actual use is to check if GlobalISelAbort is enabled.
TargetPassConfig uses TargetMachine to check if GlobalISelAbort is
enabled, but TargetMachine is also available from MachineFunction.
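As a rough illustration of the point above, the abort check can be answered from the `MachineFunction` alone. This is a minimal sketch with simplified stand-in types, not the real LLVM classes; the free function `isGlobalISelAbortEnabled` is a hypothetical helper, and the field layout is an assumption modelled on the names in the commit message.

```cpp
#include <cassert>

// Simplified stand-ins for the LLVM types named in the commit message.
// They only model the one option needed for this sketch.
enum class GlobalISelAbortMode { Disable, Enable, DisableWithDiag };

struct TargetOptions {
  GlobalISelAbortMode GlobalISelAbort = GlobalISelAbortMode::Enable;
};

struct TargetMachine {
  TargetOptions Options;
};

class MachineFunction {
public:
  explicit MachineFunction(const TargetMachine &TM) : TM(TM) {}
  const TargetMachine &getTarget() const { return TM; }

private:
  const TargetMachine &TM;
};

// Hypothetical helper: answers the abort question from the function's
// target machine, so no pass-config object is needed.
bool isGlobalISelAbortEnabled(const MachineFunction &MF) {
  return MF.getTarget().Options.GlobalISelAbort == GlobalISelAbortMode::Enable;
}
```

Because the check only reads target options, any code that already holds a `MachineFunction` can perform it under either pass manager.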
The test changes are mostly GlobalISel specific regressions.
GlobalISel is still relying on isUniformMMO, but it doesn't really
have an excuse for doing so. These should be avoidable with new
regbankselect.

There is an additional regression for addrspacecast for cov4. We
probably ought to be using a separate PseudoSourceValue for the
access of the queue pointer.
`ResourceDirectoryCache::findResourceDir` uses a `std::vector` when a
`std::array` would do.
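A minimal sketch of this kind of change, not the actual `ResourceDirectoryCache::findResourceDir` code: when the element count is fixed at compile time, a `std::array` holds the arguments without the container heap allocation a `std::vector` would perform. The function names and the argument list below are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a fixed-size argument list. The number of
// arguments is known at compile time, so std::array is sufficient.
std::array<std::string, 2>
makePrintResourceDirArgs(const std::string &CompilerPath) {
  assert(!CompilerPath.empty() && "compiler path must not be empty");
  return {CompilerPath, "-print-resource-dir"};
}

// Counts the non-empty arguments; works for any fixed-size argument array.
template <std::size_t N>
std::size_t countNonEmpty(const std::array<std::string, N> &Args) {
  std::size_t Count = 0;
  for (const std::string &Arg : Args)
    if (!Arg.empty())
      ++Count;
  return Count;
}
```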
This requires an x86 build, otherwise the test will fail with:

```
Error running ThinLTO backend: No available targets are compatible with triple "x86_64-unknown-linux-gnu"
```
This patch addresses issues identified by the static analyzers, which
appear to be legitimate problems.

`FloatLoopCounterCheck.cpp`: "Dereferencing a pointer that might be
`nullptr` FS when calling `getInc`".
`ProBoundsAvoidUncheckedContainerAccessCheck.cpp`: "Dereferencing a
pointer that might be `nullptr Callee` when calling `getBeginLoc`".
`ExpandModularHeadersPPCallbacks.cpp`: Non-static class member
`CurrentToken.Flags` is not initialized in this constructor nor in any
functions that it calls. (line #101).
…lvm#170307)

This is a partial fix for the ROCm device-libs build. The build
was most likely broken by 423bdb2.
Vector registers have synthetic values for display purposes. This causes
SBValue::GetExpressionPath to dispatch
to ValueObjectSynthetic instead of ValueObjectRegister, producing
incorrect results.

Fixes llvm#147144
…70326)

4394aa6 introduced the test
amdgcn_weak_alias, which is failing on the reverse iteration build, due
to the order of the aliasees being different. This failure is a test
issue, not a bug, so the metadata checks are removed.
)

Currently we try to hoist the transformed IV increment instruction to
the header block to help with generation of postincrement instructions,
but this only works if the user instruction is also in the header. We
should instead be trying to insert it in the same block as the user.
…#169795)

Similar to llvm#169156 again, this is mostly for denormal handling as there
is no rounding step in a minnum/maxnum.
Add additional tests where extra no-alias checks are needed, as future
extensions of llvm#168771.
This moves a few existing debug info flags that were floating in the
general pool of unorganised flags over to the existing groups for debug
info flags (so that they are presented together in documentation).

As a tiny further tweak, this also fixes the spelling of "DWARF" in the
flag docs for consistency with other flags.
One of the previous PRs
llvm#169267 has reintroduced block
count to layout propagation that was removed in
llvm#168504. This PR patches the
issue.
Adding llvm:Support dep since plugin started using llvm/ADT/...
)

In Debug builds, the names of adjusted pointers have a pointer-specific
name prefix which doesn't exist in non-debug builds.

This causes differences in output when looking at the output of SROA
with a Debug or Release compiler.

For most of our ongoing testing, we essentially use a Release+Asserts
build (basically Release but without NDEBUG defined); however, we ship a
Release compiler. Therefore we want to say with reasonable confidence
that building a large project with a Release vs a Release+Asserts build
gives us the same output when the same compiler version is used.

This difference, however, makes it difficult to prove that the output is
the same if the only difference is the name when using LTO builds and
looking at bitcode.

Hence this change is being proposed.
I had a case where the frontend was generating a zero elem array in
non-shader code so it was just crashing in a release build.
Add a real error and make it not crash.

---------

Signed-off-by: Nick Sarnie <[email protected]>
…ual (llvm#170191)

This change makes StackFrame methods virtual to enable subclass
overrides and introduces BorrowedStackFrame, a wrapper that presents an
existing StackFrame with a different frame index.

This enables creating synthetic frame views or renumbering frames
without copying the underlying frame data, which is useful for frame
manipulation scenarios.

This also adds a new borrowed-info format entity that shows the
original frame index of the borrowed frame.
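The wrapper idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not lldb's implementation: `Frame` and `BorrowedFrame` are hypothetical stand-ins for `StackFrame` and `BorrowedStackFrame`, and only two accessors are modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a stack frame class whose accessors are virtual
// so that subclasses can override them.
class Frame {
public:
  Frame(std::uint32_t Index, std::string Function)
      : Index(Index), Function(std::move(Function)) {}
  virtual ~Frame() = default;

  virtual std::uint32_t GetFrameIndex() const { return Index; }
  virtual const std::string &GetFunctionName() const { return Function; }

private:
  std::uint32_t Index;
  std::string Function;
};

// Presents an existing frame under a different index. Frame data is not
// copied; queries other than the index are forwarded to the borrowed frame.
class BorrowedFrame : public Frame {
public:
  BorrowedFrame(std::shared_ptr<Frame> Borrowed, std::uint32_t NewIndex)
      : Frame(NewIndex, std::string()), Borrowed(std::move(Borrowed)) {
    assert(this->Borrowed && "a borrowed frame needs an underlying frame");
  }

  const std::string &GetFunctionName() const override {
    return Borrowed->GetFunctionName();
  }

  // The index the underlying frame reports for itself.
  std::uint32_t GetBorrowedFrameIndex() const {
    return Borrowed->GetFrameIndex();
  }

private:
  std::shared_ptr<Frame> Borrowed;
};
```

Code that only sees a `Frame` reference gets the renumbered index, while the original index stays available through the wrapper.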

Signed-off-by: Med Ismail Bennani <[email protected]>
dmpots and others added 20 commits December 2, 2025 11:13
This commit changes how the DWARF expression evaluator handles the
deref operation for register and implicit locations on the stack. For a
typical memory location a deref operation will read the value from
memory. For register and implicit locations the deref operation will
read the value from the register or its implicit location. In lldb we
eagerly read register and implicit values and push them on the stack so
the deref operation for these becomes a "no-op" that leaves the value on
the stack and updates the tracked location kind.

The motivation for this change is to handle `DW_OP_deref*` operations on
location descriptions as described by the heterogeneous debugging
[extensions](https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/llvm/html/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#a-2-5-4-4-4-register-location-description-operations).

Specifically, for register locations it states

> These operations obtain a register location. To fetch the contents of
> a register, it is necessary to use DW_OP_regval_type, use one of the
> DW_OP_breg* register-based addressing operations, or use DW_OP_deref* on
> a register location description.

My understanding is that this is the intended behavior from DWARF 5 as
well and is not a change in behavior.
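The deref handling described in this commit message can be sketched roughly as below. This is a toy model, not lldb's `DWARFExpression` code: the `LocationKind` enum, the `StackEntry` struct, the map-backed `Memory`, and the choice of a `Value` kind as the updated location kind are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical location kinds tracked alongside each stack entry.
enum class LocationKind { Memory, Register, Implicit, Value };

struct StackEntry {
  std::uint64_t Value; // an address, or an eagerly read register/implicit value
  LocationKind Kind;
};

// Toy memory model: address -> stored value.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// Applies a deref to the top of the stack. Returns false if a memory read
// fails. Assumption: after any deref the entry is tracked as a plain value.
bool deref(std::vector<StackEntry> &Stack, const Memory &Mem) {
  assert(!Stack.empty() && "deref needs an operand on the stack");
  StackEntry &Top = Stack.back();
  switch (Top.Kind) {
  case LocationKind::Register:
  case LocationKind::Implicit:
    // The value was read eagerly when it was pushed, so the value itself
    // stays untouched; only the tracked kind changes below.
    break;
  case LocationKind::Memory:
  case LocationKind::Value: {
    // Treat the entry as an address and read the value stored there.
    auto It = Mem.find(Top.Value);
    if (It == Mem.end())
      return false;
    Top.Value = It->second;
    break;
  }
  }
  Top.Kind = LocationKind::Value;
  return true;
}
```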
The 'routine' construct just adds an acc.routine element to the global
module, which contains all of the information about the directive. It
contains a reference to the function, which also contains a reference to
the acc.routine, which this generates.

This handles both the implicit-func version (where the routine is
spelled without parens, and just applies to the next function) and
the explicit-func version (where the routine is spelled with the func
name in parens).

The AST stores the directive in an OpenACCRoutineDeclAttr in the
implicit case, so we can emit that when we hit the function declaration.
The explicit case is held in an OpenACCRoutineAnnotAttr on the function,
however, when we emit the function we haven't necessarily seen the
construct yet, so we can't depend on that attribute. Instead, we save up
the list in Sema so that we can emit them all at the end.

This results in the tests getting really hard to read (because ordering
is a little awkward based on spelling, with no way to fix it), so we
instead split the tests up based on topic.

One last thing: Flang spends some time determining if the clause lists
of two routines on the same function are identical, and omits the
duplicates. However, it seems to do a poor job on this when the ordering
isn't the same, or references are slightly different. This patch doesn't
bother trying that, and instead emits all, trusting the ACC dialect to
remove duplicates/handle duplicates gracefully.

Note: This doesn't cause emission of functions that would otherwise not
be emitted, but DOES emit routine references based on which function
they are attached to.
This patch extends the OpenACC PointerLikeType interface with two new
methods for generating load and store operations, enabling
dialect-agnostic memory access patterns.

New Interface Methods:
- genLoad(builder, loc, srcPtr, valueType): Generates a load operation
from a pointer-like value. Returns the loaded value.

- genStore(builder, loc, valueToStore, destPtr): Generates a store
operation to a pointer-like value.

Implementations provided for FIR pointer-like types, memref type (rank-0
only), and LLVM pointer types.

Extended TestPointerLikeTypeInterface.cpp with 'load' and 'store' test
modes.
Reducing spurious diff in an upcoming change.
This moves a call inside an assert to avoid a warning about the result
variable being unused in release builds.
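A small sketch of the pattern, using a hypothetical function rather than the code touched by this commit. It assumes the call is free of side effects, since anything inside `assert` is not evaluated when `NDEBUG` is defined.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the smallest element of a range that callers promise is sorted
// and non-empty.
int smallestOfSorted(const std::vector<int> &Values) {
  // Before (warns in release builds, where assert expands to nothing and
  // `Sorted` is never read):
  //   bool Sorted = std::is_sorted(Values.begin(), Values.end());
  //   assert(Sorted && "input must be sorted");
  //
  // After: the side-effect-free call lives inside the assert, so no
  // variable is left unused when NDEBUG is defined.
  assert(!Values.empty() && "input must be non-empty");
  assert(std::is_sorted(Values.begin(), Values.end()) &&
         "input must be sorted");
  return Values.front();
}
```

The alternative of keeping the variable and marking it `[[maybe_unused]]` is preferable when the call has side effects that must also happen in release builds.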
This reverts commit e719e93.

Reverting this since it caused a regression in our internal CI.

Deduction guides with host/device attributes are already
used in

https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocrand/library/src/rng/utils/cpp_utils.hpp#L249

```
template<class V>
__host__ __device__ vec_wrapper(V) -> vec_wrapper<V>;
```
…0358)

These two are both incredibly similar and simple, basically identical to
'seq'. This patch adds them both together.
Adding the following dependencies to PluginScriptedProcess:
- "//lldb:CoreHeaders",
- "//lldb:SymbolHeaders",
- "//llvm:Support",

For c50802c
This upstreams the handler for the BI__builtin_constant_p function.
Commit b262785 introduced a separate `AnalysisFpExc` target to try to
work around the lack of a bazel equivalent of single source file
properties. However, this introduces backref errors when
`--warn-backrefs` is enabled.

This change instead adds the `-ftrapping-math` copt to the
entire `Analysis` target.

Fix suggested by @rocallahan.
…del (llvm#168270)

The VPlan-based cost model assigns the forced cost once for a whole
VPInterleaveRecipe. Update the legacy cost model to match this behavior.
This fixes a cost-model divergence, and assigns the cost in a way that
matches the generated code more accurately.

PR: llvm#168270
This clause is pretty small/trivial and is a simple 'set a bool' value
on the IR node, so its implementation is quite simple. We create the
Operation with this as 'false', so the 'nohost' marks it as true always.
Remove a redundant duplicated computeCost call. NFC, just skipping an
unneeded call.
Shared memory for TMA operations needs to be aligned to 16. Add the ability to
set an alignment on the cuf.shared_memory operation.
…lvm#170350)

Updates `InitializeRequestArguments` to correctly follow the spec, see
https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Initialize.

This should correct which fields are tracked as optional and simplifies
some of the types to make sure they're meaningful. For example, an
`optional<bool>` isn't any more helpful than a `bool`: undefined and
false are basically equivalent, and the optional requires us to interpret
undefined as the default value everywhere we use the `optional<bool>`.
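To illustrate the `optional<bool>` point, here is a minimal sketch with hypothetical, simplified argument structs; these are not the actual lldb-dap types. The field name `supportsProgressReporting` comes from the Debug Adapter Protocol's initialize request arguments.

```cpp
#include <cassert>
#include <optional>

// Hypothetical variant that keeps "not sent" distinct from "false".
struct ArgsWithOptional {
  std::optional<bool> supportsProgressReporting;
};

// Hypothetical variant where "not sent" is represented by the default.
struct ArgsWithBool {
  bool supportsProgressReporting = false;
};

// Every use of the optional has to spell out the default again.
bool shouldReportProgress(const ArgsWithOptional &Args) {
  return Args.supportsProgressReporting.value_or(false);
}

// With a plain bool the default lives in one place, the struct definition.
bool shouldReportProgress(const ArgsWithBool &Args) {
  return Args.supportsProgressReporting;
}
```

Both overloads give the same answers for "absent", "false", and "true", which is why the extra state in the optional carries no information here.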
This change fixes a couple of issues with static resources:
- Enables assignment to static resource or resource array variables (fixes llvm#166458)
- Initializes static resources and resource arrays with default constructor that sets the handle to poison
@ronlieb ronlieb requested review from a team and dpalermo December 3, 2025 01:51
@ronlieb ronlieb removed request for krzysz00 and kuhar December 3, 2025 01:54
@z1-cciauto z1-cciauto merged commit 87c9379 into amd-staging Dec 3, 2025
20 checks passed
@z1-cciauto z1-cciauto deleted the amd/merge/upstream_merge_20251202185441 branch December 3, 2025 04:30