LLVM and SPIRV-LLVM-Translator pulldown (WW47 2025) #20664
Draft: iclsrc wants to merge 1,303 commits into sycl from llvmspirv_pulldown
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. The comments for these variables are also present in: mlir/include/mlir/Dialect/Bufferization/IR/BufferizationBase.td Identified with readability-redundant-declaration.
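A minimal sketch of the C++17 rule this commit relies on. The `Limits` struct, `kMaxDepth`, and `clampDepth` are hypothetical names chosen for illustration and do not come from the patch:

```cpp
#include <algorithm>

struct Limits {
  // C++17: a static constexpr data member is implicitly inline, so this
  // in-class declaration is also its definition.
  static constexpr int kMaxDepth = 8;
};

// Before C++17 the following out-of-line definition was required whenever
// kMaxDepth was odr-used. Since C++17 it is a redundant redeclaration, which
// is what readability-redundant-declaration reports:
//
//   constexpr int Limits::kMaxDepth;

// std::min takes its arguments by const reference, so this odr-uses kMaxDepth.
constexpr int clampDepth(int depth) {
  return std::min(depth, Limits::kMaxDepth);
}
```

Under C++17 and later this links without any separate definition of `Limits::kMaxDepth`.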
…er Box types (#165954) Currently we handle BoxChars separately, and a little differently, from the other BoxTypes. Realistically they can be handled the same way, and should be, to simplify the pass as much as we can.
Fix the compare function in getAllDBDirs(). The compare function passed to sort must implement a strict less-than operator.
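To illustrate the requirement, here is a small sketch of a strict less-than comparator for `std::sort`. The `DBDir` struct and its fields are assumptions made for the example; they are not the types used by getAllDBDirs():

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a database directory entry.
struct DBDir {
  std::string path;
  int priority;
};

// Strict weak ordering: returns true only when `a` must come before `b`.
// Writing `<=` here would make the comparator return true for equal
// elements, which is undefined behavior when passed to std::sort.
inline bool dbDirLess(const DBDir &a, const DBDir &b) {
  if (a.priority != b.priority)
    return a.priority < b.priority;
  return a.path < b.path;
}

inline std::vector<DBDir> sortedDirs(std::vector<DBDir> dirs) {
  // Sanity check: a strict ordering is irreflexive.
  assert(dirs.empty() || !dbDirLess(dirs.front(), dirs.front()));
  std::sort(dirs.begin(), dirs.end(), dbDirLess);
  return dirs;
}
```

Calling `sortedDirs({{"b", 1}, {"a", 1}, {"z", 0}})` orders the entries by priority first and path second.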
... but the constant expression test didn't allow for them, so they weren't working in initializers.
When an overflow or other floating-point exception occurs at compilation time while folding a conversion of a math library call to a smaller type, don't confuse the user by mentioning the conversion; just note that the exception was noted while folding the intrinsic function.
…66847) List-directed child input is allowed to advance to new records in some circumstances, and list-directed output should be as well so that e.g. NAMELIST output via a defined WRITE(FORMATTED) generic doesn't get truncated by FORT_FMT_RECL. Fixes llvm/llvm-project#166804.
This fixes incompleteness and inconsistency for test files added in adc7932, by
- renaming `libcxx/test/std/time/time.traits.is.clock/trait.is.clock.compile.pass.cpp` to `libcxx/test/std/time/time.traits/is.clock.compile.pass.cpp`,
- renaming `libcxx/test/libcxx/time/time.traits.is.clock/trait.is.clock.compile.verify.cpp` to `libcxx/test/libcxx/time/time.traits/is.clock.verify.cpp`, and
- adding comments clarifying what is being tested.
- 'getStmtExprResult' is removed after d9c7c76. Use the original one to get the compound stmt's expr result.
This patch is limited to single-word replacements to fix spelling and/or grammar to ease the review process. Punctuation and markdown fixes are specifically excluded.
…ith known test coverage (#161553) Add a number of simple target shuffles (fixed shuffle mask or simple immediate control) to isGuaranteedNotToBeUndefOrPoison/canCreateUndefOrPoisonForTargetNode that have known test coverage and obviously don't introduce undef/poison. These were found by adding an assert for unhandled target shuffles and running over CodeGen/X86 - providing explicit test coverage is incredibly difficult as ISD::VECTOR_SHUFFLE nodes will typically handle freeze nodes before we lower to these target shuffle nodes.
Only the Fortran source files in flang/test have been modified. The other files in the directory will be cleaned up in subsequent commits.
…ly call" (#166970) Reverts llvm/llvm-project#166769 Darwin platforms prefix symbols with `_`, other platforms don't necessarily.
#166513 broke the lowering of `arith.select` with unsupported FP4 types. For this op, it is fine to convert to `i4`.
…emote (#166869) We weren't setting `m_should_detach` when going through the `DoConnectRemote` code path. This meant that when you attached to a remote process with `gdb-remote <port>` and used Ctrl+D, it would kill the process instead of detaching from it. rdar://156111423
This patch fixes the unexpected result in monotonicity check for `@step_is_variant` in `monotonicity-no-wrap-flags.ll`. Currently, the SCEV is considered non-monotonic if it contains an expression which is neither loop-invariant nor an affine addrec. In `@step_is_variant`, the `offset_i` satisfies this condition, but `offset_i + j` was classified as monotonic. The root cause is that a non-outermost loop was passed to monotonicity checker instead of the outermost one. This patch ensures that the correct outermost loop is passed.
…truction (#165898)
Add a command-line tool `llvm-cas` to inspect the OnDisk CAS for debugging purposes. It can be used to look up or update the ObjectStore, or to put/get cache entries from the ActionCache, together with other debugging capabilities.
Pre-commit shlN_add test results with sdag. --------- Signed-off-by: John Lu <[email protected]> Co-authored-by: Matt Arsenault <[email protected]>
…sExtFNeg (#126408)
insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index
-> shuffle DestVec, (shuffle (fneg SrcVec), poison, SrcMask), Mask
Previously, this transform was only possible when the extract and insert indices were the same; this patch makes it possible even when the two indices differ.
Proof: https://alive2.llvm.org/ce/z/aDfdyG
Fixes: llvm/llvm-project#125675
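The equivalence behind this fold can be sketched on plain arrays. This is only a scalar model of the two forms, not the VectorCombine implementation: the lane count, the function names, and the use of zero to stand in for poison lanes are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;
using Vec = std::array<float, kLanes>;

// Models: insertelt Dest, (fneg (extractelt Src, extIdx)), insIdx
inline Vec insertNegScalar(Vec dest, const Vec &src, std::size_t extIdx,
                           std::size_t insIdx) {
  assert(extIdx < kLanes && insIdx < kLanes);
  dest[insIdx] = -src[extIdx];
  return dest;
}

// Models: shuffle Dest, (shuffle (fneg Src), poison, SrcMask), Mask
inline Vec insertNegShuffle(const Vec &dest, const Vec &src,
                            std::size_t extIdx, std::size_t insIdx) {
  assert(extIdx < kLanes && insIdx < kLanes);
  // fneg applied to the whole source vector.
  Vec neg{};
  for (std::size_t i = 0; i < kLanes; ++i)
    neg[i] = -src[i];
  // Inner shuffle: SrcMask moves lane extIdx into lane insIdx. The other
  // lanes are poison in IR; zero is used here only as a placeholder.
  Vec moved{};
  moved[insIdx] = neg[extIdx];
  // Outer shuffle: Mask keeps Dest everywhere except lane insIdx.
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = (i == insIdx) ? moved[i] : dest[i];
  return out;
}
```

Both functions return the same vector for any in-range pair of indices, including the case where the extract and insert indices differ.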
…66960) Eliminate compilation error related to missing exception specification 'noexcept(true)' for at_quick_exit function in C++11.
Add the nvvm.membar operation with level as defined in https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar This will be used to replace the direct intrinsic calls in CUDA Fortran for `threadfence()`, `threadfence_block()` and `threadfence_system()` currently lowered here: https://github.com/llvm/llvm-project/blob/e700f157026bf8b4d58f936c5db8f152e269d77f/flang/lib/Optimizer/Builder/CUDAIntrinsicCall.cpp#L1310 The nvvm membar intrinsics are also used in CUDA C/C++ (https://github.com/llvm/llvm-project/blob/49f55f4991227f3c7a2b8161bbf45c74b7023944/clang/lib/Headers/__clang_cuda_device_functions.h#L528)
This patch adds a job to the bazel checks workflow to run the bazel build/test. This patch only tests a couple of projects just to get things going. The plan is to expand to more projects eventually and set up a GCS bucket for caching so jobs complete quickly by using cached artifacts. This should add minimal load to the CI given the low frequency of bazel PRs, especially once we enable GCS-based caching, due to bazel's effective use of caching. Google is also sponsoring the Linux Premerge CI and is interested in having premerge bazel builds, which is why it makes sense to do premerge testing of this alternative build system using those resources.
Add in SVE/SVE2/MOPS features for AArch64 CPUs. These features may be interesting for future memory/math routines. SVE/SVE2 are now supported by more implementations:
```
❯ echo | clang-21 -dM -E - -march=native | grep -i ARM_FEAT
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_BF16 1
#define __ARM_FEATURE_BF16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_BF16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_BTI 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_COMPLEX 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_DOTPROD 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_FRINT 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_JCVT 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_MATMUL_INT8 1
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_PAUTH 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_RCPC 1
#define __ARM_FEATURE_SVE 1
#define __ARM_FEATURE_SVE2 1
#define __ARM_FEATURE_SVE_BF16 1
#define __ARM_FEATURE_SVE_MATMUL_INT8 1
#define __ARM_FEATURE_SVE_VECTOR_OPERATORS 2
#define __ARM_FEATURE_UNALIGNED 1
```
MOPS is another set of extensions for string operations, but may not be generally available for now:
```
❯ echo | clang-21 -dM -E - -march=armv9.2a+mops | grep -i MOPS
#define __ARM_FEATURE_MOPS 1
```
The `xevm.blockprefetch2d` op has its pointer operand marked as MemRead, which causes the op to get folded away by the canonicalize pass. Remove the side-effect mark and update the XeGPU to XeVM prefetch op conversion test cases to use the canonicalize pass.
…ing referred-to types into synthetic name (#166767)
**TL;DR:** See #166675 for a description of the problem, the root cause,
and one solution. This patch is the "different implementation" described
there.
This patch tries to fix the problem by recursively including the
referred-to types into the synthetic name. This way, the synthetic name
of the typedef DIE is canonicalized. See example debug prints below:
```
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x0000004c
SyntheticName = {H}BarInt{F}Foo<int>:() <- Two different names
Assigned to type descriptor. TypeEntryPtr = 0x0000004c0x0x150020a38 <- Hence different type entries
SyntheticTypeNameBuilder::addDIETypeName() is called for DIE at offset 0x00000044
SyntheticName = {H}BarInt{H}BarInt{F}Foo<int>:() <- Two different names
Assigned to type descriptor. TypeEntryPtr = 0x000000440x0x150020a60 <- Hence different type entries
```
The advantages of this approach over
llvm/llvm-project#166675 are:
1. The resulting synthetic name is more "correct" than using decl file
and line (which _can_ still collide).
1. This doesn't depend on
llvm/llvm-project#166673 to be fixed.
A **hypothetical** caveat is that it would not work if any of the referenced
types resolve to the same name for some reason (similar to how the two
typedefs resolved to the same name before this patch).
# Tests
```
cd ~/public_llvm/build
bin/llvm-lit -a \
../llvm-project/llvm/test/tools/dsymutil/typedefs-with-same-name.test \
../llvm-project/llvm/test/tools/dsymutil/X86/DWARFLinkerParallel/odr-fwd-declaration.test
```
This adds code that was previously missing from emitAutoVarAlloca to identify when an aggregate auto var is being emitted with a constant initializer, and the associated code that is called from emitAutoVarInit to store the constant. This allows significantly more efficient initialization.
We're iterating over the stop hooks, so if one of them changes the stop hook list by deleting itself or another stop hook, that invalidates our iterator. I chose to fix this by making a copy of the stop hook list and iterating over that. That's a cheap operation since this is just an array of shared pointers. More importantly, doing it this way means that on a stop where one stop hook deletes another, the deleted hook will always get a chance to run. If you iterated over the list itself, then whether the to-be-deleted hook gets to run would depend on whether it came before or after the deleting stop hook, which would be confusing.
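The snapshot approach described above can be sketched as follows. This is an illustrative model only: `Hook`, `HookList`, `remove`, and `runAll` are hypothetical names and do not reflect LLDB's actual stop-hook classes.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

struct HookList;

// Hypothetical hook: an id plus an action that may mutate the owning list.
struct Hook {
  int id;
  std::function<void(HookList &)> action;
};

struct HookList {
  std::vector<std::shared_ptr<Hook>> hooks;

  void remove(int id) {
    hooks.erase(std::remove_if(hooks.begin(), hooks.end(),
                               [id](const std::shared_ptr<Hook> &h) {
                                 return h->id == id;
                               }),
                hooks.end());
  }

  // Runs every hook that existed when the stop began and returns their ids.
  std::vector<int> runAll() {
    // Copying a vector of shared_ptr is cheap and keeps each hook alive
    // even if another hook removes it from `hooks` during the loop.
    std::vector<std::shared_ptr<Hook>> snapshot = hooks;
    std::vector<int> ran;
    for (const std::shared_ptr<Hook> &h : snapshot) {
      assert(h && "hook entries are expected to be non-null");
      ran.push_back(h->id);
      if (h->action)
        h->action(*this);
    }
    return ran;
  }
};
```

With this design, a hook that deletes a later hook does not prevent the deleted hook from running on the current stop; the deletion only takes effect for subsequent stops.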
It's already used for the field in the struct.
… (#167323) Makes it much easier to work out what still needs to be converted to be constexpr compatible.
…(#166926) For predicated SVE instructions where we know that the inactive lanes are undef, it is better to pick a destination register that is not unique. This avoids introducing a movprfx to copy a unique register to the destination operand, which would be needed to comply with the tied operand constraints. For example:
```
%src1 = COPY $z1
%src2 = COPY $z2
%dst = SDIV_ZPZZ_S_UNDEF %p, %src1, %src2
```
Here it is beneficial to pick $z1 or $z2 as the destination register, because if a unique register (e.g. $z0) were chosen, the pseudo expand pass would need to insert a MOVPRFX to expand the operation into:
```
$z0 = SDIV_ZPZZ_S_UNDEF $p0, $z1, $z2
->
$z0 = MOVPRFX $z1
$z0 = SDIV_ZPmZ_S $p0, $z0, $z2
```
By picking $z1 directly, we'd get:
```
$z1 = SDIV_ZPmZ_S $p0, $z1, $z2
```
CONFLICT (content): Merge conflict in llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp
CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
CONFLICT (content): Merge conflict in clang/lib/CodeGen/CGCall.cpp
CONFLICT (content): Merge conflict in llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
stdout: 'CONFLICT (modify/delete): .ci/generate_test_report_github.py deleted in HEAD and modified in 811fe02. Version 811fe02 of .ci/generate_test_report_github.py left in tree.
CONFLICT (modify/delete): .github/workflows/premerge.yaml deleted in HEAD and modified in 811fe02. Version 811fe02 of .github/workflows/premerge.yaml left in tree.
CONFLICT (modify/delete): .github/workflows/release-binaries.yml deleted in HEAD and modified in 811fe02. Version 811fe02 of .github/workflows/release-binaries.yml left in tree.
CONFLICT (content): Merge conflict in llvm/docs/SPIRVUsage.rst
CONFLICT (content): Merge conflict in llvm/lib/Target/SPIRV/SPIRVSymbolicOperands.td
It should be a function-level OpVariable, not Global var. Fixes KhronosGroup/SPIRV-LLVM-Translator#3402 Signed-off-by: Dmitry Sidorov <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@8ef804541897a80
5d4a581 to 5f5d631
Conflicts: clang/lib/Driver/ToolChains/SYCL.cpp llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp llvm/lib/Target/SPIRV/SPIRVSymbolicOperands.td
acea892 to a4f0121
LLVM: llvm/llvm-project@7fe60a7
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@cf8265055b4981d