forked from llvm/llvm-project
merge main into amd-staging #812
Merged
These fallback layouts are essentially guesses. Used when there is no other way to query register information from the debug server. Therefore there is a risk that LLDB and the debug server disagree, which can produce strange effects. I have added a log message here so we have a clue when triaging these problems. Note that it's not wrong to assume a layout in some situations. It's how some debug servers were built. However if you end up using the fallback when the server expected you to use XML, you're likely going to have a bad time.
To start using the more recently built containers.
…0559) Add support for specifying the names of address spaces when specifying pointer properties for an address space. Update LLVM's AsmPrinter and LLParser to print and read these symbolic address space names.
This fixes the missing `Op` suffix in the definition of `CIR_AtomicFence` and also improves the `makeAtomicFenceValue` API.
Add parsing and semantic checks for DIMS modifier on NUM_TEAMS, NUM_THREADS, and THREAD_LIMIT.
Add core abstractions for identifying program entities across compilation and link unit boundaries in the Scalable Static Analysis Framework (SSAF). Introduces three key components:
- BuildNamespace: Represents build artifacts (compilation units, link units)
- EntityName: Globally unique entity identifiers across compilation boundaries
- AST mapping: Functions to map Clang AST declarations to EntityNames
Entity identification uses Unified Symbol Resolution (USR) as the underlying mechanism, with extensions for sub-entities (parameters, return values) via suffixes. The abstraction allows whole-program analysis by providing stable identifiers that persist across separately compiled translation units.
Patch disables memory intrinsics expansion, enabled by default in llvm#168622. This patch does the same in clang, but not in flang. The expansion causes massive perf regressions, up to 2x in Fortran code. Reviewers: jeanPerier, vzakhari Reviewed By: vzakhari Pull Request: llvm#171650
…ysis (llvm#168903) This patch introduces a new TargetTransformInfo hook `getInstructionUniformity()` that provides a unified interface for querying target-specific uniformity information about instructions and values. The new hook returns an `InstructionUniformity` enum with three values:
- Default: Result is uniform if all operands are uniform (standard propagation)
- AlwaysUniform: Result is always uniform regardless of operands
- NeverUniform: Result can never be assumed uniform
This API wraps the existing `isAlwaysUniform()` and `isSourceOfDivergence()` hooks, providing a single entry point for uniformity queries. Both LLVM IR-level (via TTI) and MIR-level (via TargetInstrInfo) uniformity analysis have been updated to use the new hook. Target implementations:
- AMDGPU: Wraps existing `isAlwaysUniform()` and `isSourceOfDivergence()` hooks
- NVPTX: Wraps existing `isSourceOfDivergence()` hook
This is an NFC change: all implementations return conservative defaults or wrap existing functionality. Ref patch: llvm#137639
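As a rough illustration of the three-valued result described above, here is a small Python model. It is only a sketch, not LLVM's C++ API: the names `classify` and `is_result_uniform` are hypothetical, and the choice to check the always-uniform flag before the divergence flag is an assumption.

```python
from enum import Enum, auto


class InstructionUniformity(Enum):
    DEFAULT = auto()         # uniform if all operands are uniform
    ALWAYS_UNIFORM = auto()  # uniform regardless of operands
    NEVER_UNIFORM = auto()   # can never be assumed uniform


def classify(always_uniform, source_of_divergence):
    """Hypothetical wrapper that folds two boolean hooks into one enum.

    Checking always_uniform first is an assumption made for this sketch.
    """
    if always_uniform:
        return InstructionUniformity.ALWAYS_UNIFORM
    if source_of_divergence:
        return InstructionUniformity.NEVER_UNIFORM
    return InstructionUniformity.DEFAULT


def is_result_uniform(kind, operands_uniform):
    """Combine the target's answer with the uniformity of each operand."""
    if kind is InstructionUniformity.ALWAYS_UNIFORM:
        return True
    if kind is InstructionUniformity.NEVER_UNIFORM:
        return False
    # DEFAULT: standard propagation from the operands.
    return all(operands_uniform)
```

With this model, a `DEFAULT` instruction with one divergent operand is divergent, while an `ALWAYS_UNIFORM` one stays uniform whatever its operands are.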
Bumps the runner version to keep things up to date (and prevent us from falling below the support horizon).
AppendError ends up trimming this "\n" from the end of the string, then putting another one on. So there's no reason to keep appending the newline in CommandObjectMultiword::Execute.
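One way to picture why the extra newline is redundant, assuming the trim-then-append behaviour the message describes. `append_error` below is a hypothetical Python stand-in, not LLDB's actual `AppendError`.

```python
def append_error(buffer, message):
    """Model of the described behaviour: trim trailing newlines, then add one."""
    return buffer + message.rstrip("\n") + "\n"


# The caller gets the same result whether or not it appends "\n" itself.
with_newline = append_error("", "invalid command\n")
without_newline = append_error("", "invalid command")
```

Under that assumption both calls produce the same string, so the caller's own newline adds nothing.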
…MT distribution with partial offsets. (llvm#171512) `vector.extract_strided_slice` can have two forms when specifying offsets.
Case 1:
```
%1 = vector.extract_strided_slice %0 { offsets = [8, 0], sizes = [8, 16], strides = [1, 1]} : vector<24x16xf32> to vector<8x16xf32>
```
Case 2:
```
%1 = vector.extract_strided_slice %0 { offsets = [8], sizes = [8], strides = [1]} : vector<24x16xf32> to vector<8x16xf32>
```
These two ops mean the same thing, but case 2 is syntactic sugar that avoids specifying offsets for fully extracted dims. Currently case 2 fails in XeGPU SIMT distribution. This PR fixes the issue.
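The equivalence of the two forms can be sketched by padding the partial form out to full rank. This is a Python model only; `normalize_slice` is a hypothetical helper and does not reflect how the XeGPU pass itself is written.

```python
def normalize_slice(shape, offsets, sizes, strides):
    """Pad partial offsets/sizes/strides so every dimension is explicit.

    Trailing dimensions that are not mentioned are fully extracted:
    offset 0, size equal to the dimension, stride 1.
    """
    if not (len(offsets) == len(sizes) == len(strides)):
        raise ValueError("offsets, sizes and strides must have equal length")
    if len(offsets) > len(shape):
        raise ValueError("more slice dimensions than vector dimensions")
    tail = list(shape[len(offsets):])
    return (
        list(offsets) + [0] * len(tail),
        list(sizes) + tail,
        list(strides) + [1] * len(tail),
    )


# Case 2 (partial) normalizes to exactly the attributes of case 1 (full).
partial = normalize_slice([24, 16], [8], [8], [1])
full = normalize_slice([24, 16], [8, 0], [8, 16], [1, 1])
```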
If a veneer is not disassembled in lite mode, the veneer elimination pass will not recognize it as such and the call to such veneer will remain unchanged. Later, we may need to insert a new veneer for such code ending up with a double veneer. To avoid such suboptimal code generation, always disassemble veneers and guarantee that they are converted to direct calls in BOLT.
This seems the standard way to get the path to such tools within LLVM. Calling findBuiltClang() has some annoying behavior like falling back to CC when it cannot find anything else, which might point to anything or not even be set. We noticed this with our internal build system as the lli binary is not in the same path as the clang binary.
…n popen.cpp (llvm#171622) This test has begun failing on iossim with 'sh: sort: command not found' in the stderr. I believe this may be due to the change to the lit internal shell not having 'sort' in its path. This patch adds the full path /usr/bin/sort to work around this.
…n NVVMDialect.cpp (NFC)
…m#171579) Assign output sections for injected functions explicitly, and don't reassign in AssignSections pass. This change is a prerequisite for further PRs where veneer functions are created as injected functions and their code section depends on their placement.
…#171639) Besides simplifying the code, this refactor should also make it more efficient: instead of using the `hasAnySubstatement` matcher to find blocks we're interested in, which requires looking through every substatement, this PR introduces a custom `hasFinalStmt` matcher which only checks the last substatement.
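To make the efficiency argument concrete, here is a toy Python model of the two matching strategies. The statement strings and the `is_return` predicate are made up for illustration, and these functions are not the clang-tidy matchers themselves.

```python
def has_any_substatement(block, pred):
    """May evaluate the predicate on every statement in the block."""
    return any(pred(stmt) for stmt in block)


def has_final_stmt(block, pred):
    """Evaluates the predicate on the last statement only."""
    return bool(block) and pred(block[-1])


def counting(pred):
    """Wrap a predicate so the number of evaluations can be observed."""
    calls = [0]

    def wrapped(stmt):
        calls[0] += 1
        return pred(stmt)

    return wrapped, calls


def is_return(stmt):
    # Hypothetical predicate for the sketch.
    return stmt.startswith("return")


block = ["x = 1", "y = 2", "return x"]
final_pred, final_calls = counting(is_return)
any_pred, any_calls = counting(is_return)
final_hit = has_final_stmt(block, final_pred)
any_hit = has_any_substatement(block, any_pred)
```

The two strategies are not interchangeable in general; the final-statement form only applies when the check cares about the last substatement, as the message says.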
This PR adds support for treating WTF::move like std::move in various WebKit checkers.
Add support for the ExpressionTraitExpr
llvm#171647) …pt (llvm#169559)" This reverts commit 4da31b6.
This obsoletes the FIXME in llvm#85686, but it doesn't address the issue where moves from CCR will still be emitted on 68000. However, all such moves will now be emitted as physreg copies, and the issue can thus be handled there in a followup change.
/usr/bin/ld: tools/clang/unittests/Analysis/Scalable/CMakeFiles/ClangScalableAnalysisFrameworkTests.dir/ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN5clang7ASTUnitD1Ev'
…lvm#167754) This adjusts the behavior of running dap_server.py directly to better support the current state of development. A few parts of the 'main' body were stale and not functional. These improvements include:
* Instead of the custom tracefile / replay file parsing logic, I adjusted the replay helper to handle parsing lldb-dap log files created with the `LLDBDAP_LOG` env variable, allowing you to more easily run a failing test like: `python3 lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py --adapter lldb-dap -r lldb-test-build.noindex/tools/lldb-dap/console/TestDAP_console.test_custom_escape_prefix/dap.txt`
* Migrated argument parsing to `argparse`, which is available in all versions of py3+ and has a few improvements over `optparse`.
* Corrected the existing arguments and updated `run_vscode` > `run_adapter`.
You can use this for simple debugging like: `xcrun python3 lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py --adapter=lldb-dap --adapter-arg='--pre-init-command' --adapter-arg 'help' --program a.out --init-command 'help'`
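For readers unfamiliar with repeatable options in `argparse`, the following is a minimal sketch of a parser that accepts the flags used in the commands above. It is not the parser from dap_server.py: the `dest` names, the help strings, and the long spelling `--replay` for `-r` are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a DAP test driver CLI")
    parser.add_argument("--adapter", help="path or name of the adapter binary")
    # action="append" lets the flag be repeated; each use adds one list entry.
    parser.add_argument("--adapter-arg", action="append", default=[],
                        dest="adapter_args", help="argument passed to the adapter")
    parser.add_argument("--program", help="program to debug")
    parser.add_argument("--init-command", action="append", default=[],
                        dest="init_commands", help="command to run at startup")
    # "--replay" is a hypothetical long name chosen for this sketch.
    parser.add_argument("-r", "--replay", help="log file to replay")
    return parser


args = build_parser().parse_args([
    "--adapter=lldb-dap",
    "--adapter-arg=--pre-init-command",  # '=' is needed when the value starts with '-'
    "--adapter-arg", "help",
    "--program", "a.out",
    "--init-command", "help",
])
```

The `=` form matters for values such as `--pre-init-command`, which `argparse` would otherwise try to read as another option.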
Adds a flag COMPILER_RT_PROFILE_BAREMETAL, which disables the parts of the profile runtime which require a filesystem or malloc. This minimal library only requires string.h from the C library. This is useful for profiling or code coverage of baremetal images, which don't have filesystem APIs, and might not have malloc configured (or have limited heap space). Expected usage:
- Add code to your project to call `__llvm_profile_get_size_for_buffer()` and `__llvm_profile_write_buffer()` to write the profile data to a buffer in memory, and then copy that data off the device using target-specific tools.
- If you're using a linker script, set up your linker script to map the profiling and coverage input sections to corresponding output sections with the same name, and mark them KEEP. `__llvm_covfun` and `__llvm_covmap` are non-allocatable, `__llvm_prf_names` is read-only allocatable, and `__llvm_prf_cnts` and `__llvm_prf_data` are read-write allocatable.
- The resulting data is in the same format as the non-baremetal profiles.
There's some room for improvement here in the future for doing profiling and code coverage for baremetal. If we revised the profiling format, and introduced some additional host tooling, we could move some of the metadata into non-allocated sections, and construct the profraw file on the host. But this patch is sufficient for some use-cases.
Use the same twiden format for PseudoSF_VSETTM and PseudoSF_VSETTK as other XSfmm pseudos. Though I don't think we use the operand from these instructions.
… peelToTurnInvariantLoadsDereferenceable. (llvm#171547) llvm.assume intrinsics have the mayWriteToMemory property, but won't prevent the load from becoming dereferenceable.
Add documentation for variadic `isa<>` in the LLVM Programmer's Manual.
…ge (llvm#171705) As it is done in `flang-rt/lib/runtime/edit-input.cpp`, emit a runtime error message when trying to raise IEEE exception on the device. `MapException` and `feraiseexcept` are used in the lowering of the nearest intrinsic even on the device.
…utable in popen.cpp" (llvm#171706) Reverts llvm#171622 Co-authored-by: Andrew Haberlandt <[email protected]>
Previously we would hit an assertion failure when a relocation represented by a PAuth ifunc required a GOT and the addend of a relocation did not fit into the immediate operand of an ADD instruction. Fix it by extracting a function for materializing arbitrary addends and using it to materialize the addend. Reviewers: fmayer, hvdijk Pull Request: llvm#171707
Previously we would assert when a ValueTypeByHwMode was missing a case for the current mode; now we report an error instead. Interestingly, this error only occurs when the DAG patterns use RegClassByHwMode, but not normal RegisterClass instances. Found while I added RegClassByHwMode to RISC-V and was getting an assertion due to `XLenFVT`/`XLenVecI32VT` not having an entry for the default mode. Reviewed By: arsenm Pull Request: llvm#171254
Previously, we were emitting a broken AliasPatternCond array, outputting `MyTarget::RegClassByHwModeRegClassID` which does not exist. Instead, we now add a new predicate and pass the RegClassByHwMode index as the value argument. Pull Request: llvm#171264
Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170947
printFlags takes care of inserting the correct amount of spaces, depending on whether there are flags to print or not.
This pass implements the OpenACC loop tiling transformation for acc.loop
operations that have the tile clause (OpenACC 3.4 spec, section 2.9.8).
The tile clause specifies that the iterations of the associated loops
should be divided into tiles (rectangular blocks). The pass transforms a
single or nested acc.loop with tile clauses into a structure of "tile
loops" (iterating over tiles) containing "element loops" (iterating
within tiles).
For example, tiling a 2-level nested loop with tile(T1, T2):
```
// Before tiling:
acc.loop tile(T1, T2) control(%i, %j) = ...
// After tiling:
acc.loop control(%i) step (s1*T1) { // tile loop 1
  acc.loop control(%j) step (s2*T2) { // tile loop 2
    acc.loop control(%ii) = (%i) to (min(ub1, %i+s1*T1)) {
      acc.loop control(%jj) = (%j) to (min(ub2, %j+s2*T2)) {
        // loop body using %ii, %jj
      }
    }
  }
}
```
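The tiled structure above can be checked against the original loop nest with a small Python simulation. This is a sketch under stated assumptions: steps are positive, upper bounds are treated as exclusive, and the function names are hypothetical. It is not the pass implementation.

```python
def untiled(lb1, ub1, s1, lb2, ub2, s2):
    """Iteration space of the original 2-level loop nest."""
    return [(i, j) for i in range(lb1, ub1, s1) for j in range(lb2, ub2, s2)]


def tiled(lb1, ub1, s1, lb2, ub2, s2, t1, t2):
    """Same iteration space, visited tile by tile."""
    visited = []
    for i in range(lb1, ub1, s1 * t1):                           # tile loop 1
        for j in range(lb2, ub2, s2 * t2):                       # tile loop 2
            for ii in range(i, min(ub1, i + s1 * t1), s1):       # element loop 1
                for jj in range(j, min(ub2, j + s2 * t2), s2):   # element loop 2
                    visited.append((ii, jj))
    return visited


reference = untiled(0, 10, 1, 0, 7, 2)
result = tiled(0, 10, 1, 0, 7, 2, t1=4, t2=2)
```

The visiting order changes, but every iteration is executed exactly once; the `min` on each element loop's upper bound is what keeps partial tiles at the edges from running past `ub1` or `ub2`.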
Key features:
- Handles constant tile sizes and wildcard tile sizes ('*') which use a
configurable default tile size
- Properly handles collapsed loops with tile counts exceeding collapse
count by uncollapsing loops before tiling
- Distributes gang/worker/vector attributes appropriately: gang -> tile
loops, vector -> element loops
- Validates that tile size types are not wider than loop IV types
- Emits optimization remarks for tiling decisions
Three test files are added:
- acc-loop-tiling.mlir: Tests single and nested loop tiling with
constant tile sizes, unknown tile sizes (*), and loops with collapse
attributes
- acc-loop-tiling-invalid.mlir: Tests error diagnostic when tile size
type is wider than the loop IV type
- acc-loop-tiling-remarks.mlir: Tests optimization remarks emitted for
tiling decisions including default tile size selection
Co-authored-by: Vijay Kandiah <[email protected]>
Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170950
llvm#169131 Should fix: ASTEntityMappingTest.cpp.o: undefined reference to symbol '_ZN4llvm3omp27isAllowedClauseForDirectiveENS0_9DirectiveENS0_6ClauseEj' https://lab.llvm.org/buildbot/#/builders/10/builds/18851
That hopefully concludes the initial upstreaming. Reviewers: jvoung Reviewed By: jvoung Pull Request: llvm#170951
…llvm#169779) SelectionDAG uses the DAGCombiner to fold a load followed by a sext to a load and sext instruction. For example, in x86 we will see that
```
%1 = load i32, ptr @GloBaRR
#dbg_value(i32 %1, !43, !DIExpression(), !52)
%2 = sext i32 %1 to i64, !dbg !53
```
is converted to:
```
%0:gr64_nosp = MOVSX64rm32 $rip, 1, $noreg, @GloBaRR, $noreg, debug-instr-number 1, debug-location !51
DBG_VALUE $noreg, $noreg, !"Idx", !DIExpression(), debug-location !52
```
The `DBG_VALUE` needs to be transferred correctly to the new combined instruction, and it needs to be appended with a `DIExpression` which contains a `DW_OP_LLVM_fragment`, describing that the lower bits of the virtual register contain the value. This patch fixes the problem described above.
This patch fixes issues introduced by llvm#171491 when running tests in CI. The shell tests expect certain characters when matching diagnostics. With llvm#171491, those characters can either be Unicode specific characters or their ASCII equivalent. The tests were always expecting the ASCII version. This patch fixes this by using a regex to match one or the other.
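The "match one or the other" idea can be sketched as follows. The glyph pairs and the diagnostic shape here are illustrative assumptions and are not taken from the patch or from llvm#171491.

```python
import re

# Hypothetical (Unicode form, ASCII form) pairs, chosen only for illustration.
BAR = ("│", "|")
CARET = ("˄", "^")


def either(unicode_form, ascii_form):
    """Build a regex fragment that accepts either rendering of a glyph."""
    return "(?:" + re.escape(unicode_form) + "|" + re.escape(ascii_form) + ")"


# A made-up diagnostic shape: a caret line followed by a bar line.
PATTERN = re.compile(
    r"^\s*" + either(*CARET) + r"\n\s*" + either(*BAR) + r" error: .+$",
    re.MULTILINE,
)

ascii_output = "  ^\n  | error: bad thing"
unicode_output = "  ˄\n  │ error: bad thing"
```

Escaping each alternative with `re.escape` matters because characters such as `^` and `|` are regex metacharacters.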
If the 'counted_by' value is signed, we will incorrectly allow accesses when the value is negative. This has obvious bad effects as it will allow accessing a huge swath of unallocated memory. Also clarify and rearrange the parameters to make them more perspicuous. Fixes: llvm#170987.
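The failure mode can be illustrated with a small Python model of a bounds check. It is a sketch only: the function names are hypothetical, and treating a negative count as "no elements accessible" is an assumption about the intended behaviour, not a description of the generated code.

```python
def to_unsigned(value, bits=64):
    """Reinterpret a signed integer as an unsigned one of the given width."""
    return value & ((1 << bits) - 1)


def access_allowed_buggy(index, count, bits=64):
    """Model of the bug: the signed count is compared as if it were unsigned."""
    return to_unsigned(index, bits) < to_unsigned(count, bits)


def access_allowed_fixed(index, count):
    """A negative count permits no accesses at all."""
    return count > 0 and 0 <= index < count


# With count == -1 the unsigned comparison sees 2**64 - 1 elements.
huge_bound = to_unsigned(-1)
```

In the buggy model an index such as 1000 passes the check against a count of -1, which is the "huge swath of unallocated memory" the message refers to.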
ronlieb
approved these changes
Dec 11, 2025
llvm#171745) … instrs. (llvm#169779)" This reverts commit 2b958b9. I might have broken the sanitizer-x86_64-linux bot /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp clang++: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248: const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t) const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length && "Invalid index!"' failed.
No description provided.