forked from llvm/llvm-project
merge main into amd-staging #547
Merged
…tion's Signature (llvm#167248) Since llvm#162441, `buffer-results-to-out-params` transforms `private` functions only. But, as mentioned in llvm#162441 (comment), this is a breaking change for pipelines handling C code. Our pipeline @EfficientComputer is also affected by this breaking change. Therefore, this PR adds an opt-in flag to allow `public` functions to be transformed by `BufferResultsToOutParamsPass`.
) Unfortunately this is more dynamic than anticipated. Fixes llvm#165006
Fill out more information for sign and zero extend and add some truncate information; however, the primary change is to int/fp conversions. In particular, fp to (narrow) int appears to be relatively expensive.
spec: https://github.com/riscv/riscv-isa-manual/blob/smpmpmt/src/smpmpmt.adoc Co-Authored-by: Jesse Huang <[email protected]>
…ble (llvm#165525) When Polly generates a false runtime condition (RTC), the associated Polly generated loop is never executed and is eventually eliminated. As a result, the fallback loop becomes the default execution path. Disabling vectorization for this fallback loop will be counterproductive. This patch ensures that vectorization is only disabled when the RTC is not false (no Codegen failure).
…tructor-throws' (llvm#164061) Closes llvm#157299. --------- Co-authored-by: Victor Chernyakin <[email protected]>
…s generated (llvm#166910) This patch doesn't change anything. Just adds more explicit checks to verify what is generated in this case when an alloca has a zero-sized array. I'd expect an `OpRuntimeArray`, but nothing is generated.
…cape' (llvm#164081) Need these options to complete llvm#160825, but I think it's generally beneficial to fine-tune this check. --------- Co-authored-by: EugeneZelenko <[email protected]> Co-authored-by: Victor Chernyakin <[email protected]>
…m#167258) The RISC-V cost model cannot properly handle non-built-in vector types, so fall back to BasicTTI in this situation. Fixes: llvm#166732
…m#167214) `__tuple_types` is at this point just a `__type_list` with a weird name, so we can just replace the few places it's still used.
With `+SPV_KHR_float_controls2` and when there is a non-int `OpConstantNull` we would call `MI.getOperand(1).getImm()` when `MI` was not an `OpTypeInt` (the associated test has an `OpTypeArray` zeroinitialized). Under these conditions, an assertion is triggered. This patch adds the missing condition.
…lvm#165863) Extracts of unsigned i8 or i16 elements from the bottom 128 bits of a scalable register lead to the implied zero-extend being transformed to an AND mask. The mask is redundant since UMOV already zeroes the high bits of the destination register.

For example:
```c
int foo(svuint8_t x) {
  return x[3];
}
```
Currently:
```gas
foo:
  umov w8, v0.b[3]
  and w0, w8, #0xff
  ret
```
Becomes:
```gas
foo:
  umov w0, v0.b[3]
  ret
```
Specifically, this patch adds the following combines:

SUB x, (CSET LO, (CMP a, b)) -> SBC x, 0, (CMP a, b)
SUB (SUB x, y), (CSET LO, (CMP a, b)) -> SBC x, y, (CMP a, b)

The CSET may be preceded by a ZEXT. Fixes llvm#164748.
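To see why these combines preserve the result, the flag arithmetic can be modeled in plain C++. This is a minimal sketch, not code from the patch: the helper names (`carry_after_cmp`, `cset_lo`, `sbc`, and the `*_before`/`*_after` wrappers) are hypothetical, and it assumes the usual AArch64 semantics where `CMP a, b` sets the carry flag when `a >= b` (unsigned), `LO` means carry clear, and `SBC Rd, Rn, Rm` computes `Rn - Rm - (1 - C)`.

```cpp
#include <cstdint>

// Hypothetical model of the carry flag after `CMP a, b`:
// C is set when the unsigned subtraction a - b does not borrow, i.e. a >= b.
constexpr bool carry_after_cmp(std::uint64_t a, std::uint64_t b) {
    return a >= b;
}

// `CSET LO` materializes 1 when the carry flag is clear (a < b), else 0.
constexpr std::uint64_t cset_lo(std::uint64_t a, std::uint64_t b) {
    return carry_after_cmp(a, b) ? 0u : 1u;
}

// `SBC Rd, Rn, Rm` computes Rn - Rm - (1 - C), with unsigned wraparound.
constexpr std::uint64_t sbc(std::uint64_t n, std::uint64_t m, bool carry) {
    return n - m - (carry ? 0u : 1u);
}

// First combine, before: SUB x, (CSET LO, (CMP a, b))
constexpr std::uint64_t sub_cset_before(std::uint64_t x, std::uint64_t a,
                                        std::uint64_t b) {
    return x - cset_lo(a, b);
}

// First combine, after: SBC x, 0, (CMP a, b)
constexpr std::uint64_t sub_cset_after(std::uint64_t x, std::uint64_t a,
                                       std::uint64_t b) {
    return sbc(x, 0, carry_after_cmp(a, b));
}

// Second combine, before: SUB (SUB x, y), (CSET LO, (CMP a, b))
constexpr std::uint64_t sub_sub_cset_before(std::uint64_t x, std::uint64_t y,
                                            std::uint64_t a, std::uint64_t b) {
    return (x - y) - cset_lo(a, b);
}

// Second combine, after: SBC x, y, (CMP a, b)
constexpr std::uint64_t sub_sub_cset_after(std::uint64_t x, std::uint64_t y,
                                           std::uint64_t a, std::uint64_t b) {
    return sbc(x, y, carry_after_cmp(a, b));
}
```

In both cases the value subtracted by the `CSET LO` is exactly the `1 - C` borrow that `SBC` already folds in, so the separate `CSET` and `SUB` become unnecessary.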
Call getVectorTripCount first, and call getTripCount failing that, in simplifyBranchConditionForVFAndUF, to simplify missed cases. While at it, strip the dead check for a zero TC.
…lvm#166947) This patch adds another run of DropUnnecessaryAssumes after vectorization, to clean up assumes that are no longer needed after this point. The main example of such an assume is currently dereferenceable assumptions. This complements llvm#166945, which avoids sinking code if it would mean removing a dereferenceable assumption. There are a few additional cases where some unneeded assumes are left over after vectorization that also get cleaned up. The main motivation is to work together with llvm#166945, but there may be a better solution. Adding another instance of this pass to the pipeline is not great, but compile-time impact seems in the noise: https://llvm-compile-time-tracker.com/compare.php?from=55e71fe08b6406ec7ce2c81ce042e48717acf204&to=85da4ee3a74126f557cdc74c7b40e048dacb3fc4&stat=instructions:u PR: llvm#166947
llvm#166756) Section C3.2.2 (quoted below) in the ARMARM makes this a requirement of assemblers for load/stores with unscaled offset. It makes no mention of PRFM so I don't consider this to be a bug, although I can see why we would want to extend this behaviour to the unscaled variants of these instructions as well, as GCC does. This patch adds an alias for this.

C3.2.2 Load/store register (unscaled offset)

The load/store register instructions with an unscaled offset support only one addressing mode: Base plus an unscaled 9-bit signed immediate offset. See Load/store addressing modes.

The load/store register (unscaled offset) instructions are required to disambiguate this instruction class from the load/store register instruction forms that support an addressing mode of base plus a scaled, unsigned 12-bit immediate offset, because that can represent some offset values in the same range.

The ambiguous immediate offsets are byte offsets that are both:
- In the range 0-255, inclusive.
- Naturally aligned to the access size.

Other byte offsets in the range -256 to 255 inclusive are unambiguous. An assembler program translating a load/store instruction, for example LDR, is required to encode an unambiguous offset using the unscaled 9-bit offset form, and to encode an ambiguous offset using the scaled 12-bit offset form. A programmer might force the generation of the unscaled 9-bit form by using one of the mnemonics in Table C.3.21. Arm recommends that a disassembler outputs all unscaled 9-bit offset forms using one of these mnemonics, but unambiguous offsets can be output using a load/store single register mnemonic, for example, LDR.

Fixes llvm#83226.
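The selection rule quoted above can be sketched as a small classifier. This is an illustrative model only, not the assembler change itself: `OffsetForm`, `is_ambiguous`, `fits_unscaled9`, `fits_scaled12`, and `choose_form` are hypothetical names, the access size is assumed to be given in bytes, and the scaled form is assumed to accept aligned offsets whose scaled value fits in 12 unsigned bits (0 to 4095).

```cpp
#include <cstdint>

// Hypothetical labels for the two encodings discussed in the quoted section.
enum class OffsetForm { Scaled12, Unscaled9, NotEncodable };

// Ambiguous offsets: in the range 0-255 inclusive AND naturally aligned
// to the access size (both encodings could represent them).
constexpr bool is_ambiguous(std::int64_t offset, std::int64_t access_size) {
    return offset >= 0 && offset <= 255 && offset % access_size == 0;
}

// Unscaled form: 9-bit signed byte offset, -256 to 255 inclusive.
constexpr bool fits_unscaled9(std::int64_t offset) {
    return offset >= -256 && offset <= 255;
}

// Scaled form (assumption): non-negative, aligned to the access size,
// with offset / access_size fitting in an unsigned 12-bit field.
constexpr bool fits_scaled12(std::int64_t offset, std::int64_t access_size) {
    return offset >= 0 && offset % access_size == 0 &&
           offset / access_size <= 4095;
}

// Ambiguous offsets go to the scaled form; remaining offsets in
// -256..255 are unambiguous and use the unscaled form.
constexpr OffsetForm choose_form(std::int64_t offset, std::int64_t access_size) {
    return fits_scaled12(offset, access_size) ? OffsetForm::Scaled12
           : fits_unscaled9(offset)           ? OffsetForm::Unscaled9
                                              : OffsetForm::NotEncodable;
}
```

For an 8-byte access, an offset of 8 is ambiguous and lands in the scaled form, while offsets of 1 or -8 are unambiguous and land in the unscaled form.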
…61049) Several components in libc++ aren't defending against overloaded `operator,(T, Iter)` currently. Existing deleted overloads in `test_iterators.h` are insufficient for such cases. This PR adds corresponding deleted overloads with reversed order and fixes these libc++ components.

- `piecewise_linear_distribution`'s iterator pair constructor,
- `piecewise_linear_distribution::param_type`'s iterator pair constructor,
- `piecewise_constant_distribution`'s iterator pair constructor,
- `piecewise_constant_distribution::param_type`'s iterator pair constructor,
- `money_get::do_get`,
- `money_put::do_put`, and
- `num_put::do_put`.
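The kind of defense described above can be illustrated with a small standalone sketch. It is not taken from libc++ or `test_iterators.h`: `GuardedIter`, `copy_range`, and `copy_and_sum` are hypothetical names, and the deleted comma overloads only mimic the idea of test iterators that reject any use of an overloaded `operator,` in either operand order. Casting one operand to `void` forces the built-in comma operator, since no user-defined overload can accept a `void` operand.

```cpp
// Hypothetical iterator-like type used only for illustration.
struct GuardedIter {
    const int* ptr;
    constexpr int operator*() const { return *ptr; }
    constexpr GuardedIter& operator++() { ++ptr; return *this; }
    friend constexpr bool operator!=(GuardedIter a, GuardedIter b) {
        return a.ptr != b.ptr;
    }
};

// Deleted comma overloads in both operand orders: any generic code that
// writes `a, b` with a GuardedIter on either side fails to compile.
template <class T> void operator,(const GuardedIter&, T&&) = delete;
template <class T> void operator,(T&&, const GuardedIter&) = delete;

// Copy loop that advances two iterators in the increment expression.
// Writing `++first, ++out` would pick a deleted overload when InIt is
// GuardedIter; the (void) cast makes the built-in comma the only candidate.
template <class InIt, class OutPtr>
constexpr OutPtr copy_range(InIt first, InIt last, OutPtr out) {
    for (; first != last; ++first, (void)++out)
        *out = *first;
    return out;
}

// constexpr only so the behavior can be checked at compile time.
constexpr int copy_and_sum() {
    const int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};
    copy_range(GuardedIter{src}, GuardedIter{src + 3}, dst);
    return dst[0] + dst[1] + dst[2];
}
```

Removing the `(void)` cast in `copy_range` makes the instantiation with `GuardedIter` ill-formed, which is the sort of failure the added reversed-order overloads are meant to surface in tests.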
Collaborator
Author
ronlieb
approved these changes
Nov 10, 2025
No description provided.