forked from microsoft/onnxruntime
Sync with Microsoft ONNX Runtime - 10/10/2025 #826
Merged
### Description
Fixes microsoft#11523. Scalars should be no-ops for Tile. Removing the over-validation is enough to make this case work.

### Motivation and Context
Conformance with expectations.
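A minimal sketch of the behavior described above, using plain Python lists rather than the ONNX Runtime kernel. The `tile` helper is hypothetical and only illustrates why a rank-0 input, which has no axes and therefore an empty `repeats`, should pass through unchanged.

```python
def tile(data, repeats):
    """Reference Tile over nested lists; a bare number stands for a rank-0 tensor."""
    if not repeats:
        # Rank-0 input: there are no axes to repeat, so Tile is a no-op.
        return data
    head, rest = repeats[0], repeats[1:]
    # Tile each sub-tensor along the remaining axes, then repeat along this axis.
    tiled = [tile(item, rest) for item in data]
    return tiled * head
```

Calling `tile(3.0, [])` returns the scalar as-is instead of being rejected by shape validation.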
### Description
Previously, the local window size of the GQA op excluded the current token. This does not match standard HuggingFace implementations, where tokens are appended and local masking is applied afterwards; the mismatch can cause the mask to be off by 1 during generation, leading to accuracy issues.

This PR corrects the mismatch by including the current token. In practice, this effectively decreases the GQA window size by 1.

### Motivation and Context
This helps align our models with HuggingFace models.

---------

Co-authored-by: Kunal Vaishnavi <[email protected]>
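The off-by-one can be illustrated with a small mask builder. This is a sketch under one reading of the description, not the GQA kernel: `local_window_mask` and its `include_current` switch are hypothetical names, and the assumption is that the old convention counted `window` past tokens plus the current one, while the new convention counts the current token as part of the window.

```python
def local_window_mask(seq_len, window, include_current=True):
    """mask[i][j] is True when query position i may attend to key position j."""
    mask = []
    for i in range(seq_len):
        # Oldest visible key position for query i under each convention.
        first = i - window + 1 if include_current else i - window
        mask.append([max(first, 0) <= j <= i for j in range(seq_len)])
    return mask
```

With `window=2`, the last row of a length-4 mask covers two positions when the current token is included and three when it is excluded, which is the one-token difference the PR refers to.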
… definitions. (microsoft#26225)

### Description
Add CMake `mlas_private_compile_definitions` variable for internal MLAS definitions.

### Motivation and Context
Refactor so that it is easier to add new private compile definitions for MLAS and related targets.
### Description
- Introduce a new command-line flag `--skip_pip_install`, similar to the existing `--skip_submodule_sync`, to allow users to prevent `build.py` from modifying their current Python environment.

### Motivation and Context
- This is useful for scenarios where dependencies are already managed externally or when users want to avoid unintended changes during the build process.
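A minimal sketch of how such a flag could be wired with `argparse`. Only the two flag names come from the description; `parse_args` and `should_install_python_deps` are hypothetical helpers, and the real `build.py` may structure this differently.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Existing flag mentioned in the description.
    parser.add_argument("--skip_submodule_sync", action="store_true")
    # New flag: leave the caller's Python environment untouched.
    parser.add_argument("--skip_pip_install", action="store_true")
    return parser.parse_args(argv)


def should_install_python_deps(args):
    """Hypothetical guard around any pip invocation made by the build script."""
    return not args.skip_pip_install
```

Both flags default to `False`, so existing invocations keep their current behavior.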
Bumps [gradle/actions](https://github.com/gradle/actions) from 4 to 5. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/gradle/actions/releases">gradle/actions's releases</a>.</em></p> <blockquote> <h2>v5.0.0</h2> <h2>What's Changed</h2> <h3>Breaking Changes</h3> <ul> <li>Upgrade to node 24 by <a href="https://github.com/amyu"><code>@amyu</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/721">gradle/actions#721</a></li> </ul> <p>Make sure your runner is updated to this version or newer to use this release. v2.327.1 <a href="https://github.com/actions/runner/releases/tag/v2.327.1">Release Notes</a></p> <h3>Dependency upgrades</h3> <ul> <li>Bump the github-actions group across 1 directory with 2 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/748">gradle/actions#748</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/gradle/actions/compare/v4...v5.0.0">https://github.com/gradle/actions/compare/v4...v5.0.0</a></p> <h2>v4.4.4</h2> <h2>What's Changed</h2> <ul> <li>Bump the github-actions group across 2 directories with 3 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/726">gradle/actions#726</a></li> <li>Regenerating package lock by <a href="https://github.com/cdsap"><code>@cdsap</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/729">gradle/actions#729</a></li> <li>Update known wrapper checksums by <a href="https://github.com/github-actions"><code>@github-actions</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/730">gradle/actions#730</a></li> <li>Bump the github-actions group across 1 directory with 3 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a 
href="https://redirect.github.com/gradle/actions/pull/735">gradle/actions#735</a></li> <li>Bump the gradle group across 3 directories with 1 update by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/734">gradle/actions#734</a></li> <li>Bump the npm-dependencies group in /sources with 4 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/733">gradle/actions#733</a></li> <li>Bump references to Develocity Gradle plugin from 4.1.1 to 4.2 by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/736">gradle/actions#736</a></li> <li>Handle gracefully parse errors in checksum file by <a href="https://github.com/jprinet"><code>@jprinet</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/737">gradle/actions#737</a></li> <li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in /.github/workflow-samples/kotlin-dsl by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/742">gradle/actions#742</a></li> <li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in /.github/workflow-samples/java-toolchain by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/741">gradle/actions#741</a></li> <li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in /.github/workflow-samples/groovy-dsl by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/740">gradle/actions#740</a></li> <li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in /.github/workflow-samples/gradle-plugin by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a 
href="https://redirect.github.com/gradle/actions/pull/739">gradle/actions#739</a></li> <li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in /sources/test/init-scripts by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/738">gradle/actions#738</a></li> <li>Update known wrapper checksums by <a href="https://github.com/github-actions"><code>@github-actions</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/743">gradle/actions#743</a></li> <li>Bump com.google.guava:guava from 33.4.8-jre to 33.5.0-jre in /.github/workflow-samples/kotlin-dsl in the gradle group across 1 directory by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/746">gradle/actions#746</a></li> <li>Bump the npm-dependencies group in /sources with 5 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/745">gradle/actions#745</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/gradle/actions/compare/v4...v4.4.4">https://github.com/gradle/actions/compare/v4...v4.4.4</a></p> <h2>v4.4.3</h2> <h2>What's Changed</h2> <ul> <li>Adapt tests to future new Build Scan publication message by <a href="https://github.com/alextu"><code>@alextu</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/708">gradle/actions#708</a></li> <li>Add missing Gradle version input to setup-gradle by <a href="https://github.com/jprinet"><code>@jprinet</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/713">gradle/actions#713</a></li> <li>Bump the github-actions group across 2 directories with 4 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/710">gradle/actions#710</a></li> <li>Bump references 
to Develocity Gradle plugin from 4.1 to 4.1.1 by <a href="https://github.com/bot-githubaction"><code>@bot-githubaction</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/712">gradle/actions#712</a></li> <li>Update known wrapper checksums by <a href="https://github.com/github-actions"><code>@github-actions</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/709">gradle/actions#709</a></li> <li>Bump the npm-dependencies group across 1 directory with 4 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/711">gradle/actions#711</a></li> <li>Do not run setup-gradle post action if workflow is cancelled by <a href="https://github.com/jprinet"><code>@jprinet</code></a> in <a href="https://redirect.github.com/gradle/actions/pull/716">gradle/actions#716</a></li> <li>Bump the github-actions group across 2 directories with 2 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/715">gradle/actions#715</a></li> <li>Bump the npm-dependencies group across 1 directory with 3 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/720">gradle/actions#720</a></li> <li>Bump github/codeql-action from 3.29.11 to 3.30.0 in the github-actions group across 1 directory by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/719">gradle/actions#719</a></li> <li>Bump com.fasterxml.jackson.dataformat:jackson-dataformat-smile from 2.19.2 to 2.20.0 in /sources/test/init-scripts in the gradle group across 1 directory by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/718">gradle/actions#718</a></li> <li>Update known 
wrapper checksums by <a href="https://github.com/github-actions"><code>@github-actions</code></a>[bot] in <a href="https://redirect.github.com/gradle/actions/pull/723">gradle/actions#723</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/gradle/actions/commit/4d9f0ba0025fe599b4ebab900eb7f3a1d93ef4c2"><code>4d9f0ba</code></a> Bump the github-actions group across 1 directory with 2 updates (<a href="https://redirect.github.com/gradle/actions/issues/748">#748</a>)</li> <li><a href="https://github.com/gradle/actions/commit/4b530e369bfef1ac8fc2160ec97b9fda1ccd9901"><code>4b530e3</code></a> Bump the github-actions group across 1 directory with 2 updates</li> <li><a href="https://github.com/gradle/actions/commit/e60655a8a03bf3b9a7ff400dc5ef49bed725bec8"><code>e60655a</code></a> Upgrade to node 24 (<a href="https://redirect.github.com/gradle/actions/issues/721">#721</a>)</li> <li>See full diff in <a href="https://github.com/gradle/actions/compare/v4...v5">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description
Enable handling of a few new EP options.
### Description
Move the packaging pipelines' macOS build jobs to a new machine pool that is much faster, which can reduce the build time by 85%. However, it has some drawbacks:
1. It does not have Java installed. Also, ADO's JavaToolInstaller task does not support macOS, so these steps have to be removed from the pipeline. We will stop providing Java binaries for macOS until this issue is resolved.
2. It does not have CocoaPods installed. However, CocoaPods itself is being deprecated, so this is not a big deal.
The `check_emulator_running_using_avd_name` function should not assume adb is in PATH. This PR fixes the issue.
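One way to avoid depending on PATH is to derive the adb location from the Android SDK root, where adb lives under `platform-tools`. The sketch below is an assumption about the approach, not the code in this PR; `resolve_adb` and its parameters are hypothetical.

```python
import os


def resolve_adb(sdk_root, is_windows=False):
    """Build the adb path from the SDK root instead of relying on PATH."""
    name = "adb.exe" if is_windows else "adb"
    return os.path.join(sdk_root, "platform-tools", name)
```

A caller would pass the resolved path to its subprocess invocation rather than the bare string `adb`.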
### Description
Add `-Wno-deprecated-declarations` when compiling unit tests with TRT/NV EP support.

### Motivation and Context
Fixes source build for these EPs.

Signed-off-by: Kevin Chen <[email protected]>
…osoft#26230)

Users with RTX 5090 GPUs are experiencing runtime errors when using onnxruntime-gpu:

```
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Slice node. Name:'Slice_34' Status Message: CUDA error cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
```

This occurs because the RTX 5090 uses CUDA compute architecture 12.0 (SM 12.0), while `onnxruntime-gpu` 1.23 was built with `90a-virtual`. The `90a` architecture is a specialized, non-forward-compatible version of the Hopper architecture, making it incompatible with later GPU generations such as Blackwell.

This change reverts `90a-virtual` back to `90-virtual`, as used in 1.22, which should restore compatibility with Blackwell GPUs.

FPA_INTB_GEMM is disabled by default. It needs some extra work to make it compatible with the use case of `90-virtual` and no `90a-real`.

Related: microsoft#26002 microsoft#26226 microsoft#26181
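The compatibility rule behind this failure can be sketched as a small predicate. This is a simplified model of the behavior described above, not CUDA tooling: `ptx_can_run_on` is a hypothetical helper that treats a virtual architecture such as `90` as usable on any device of equal or higher compute capability, and an architecture-specific one such as `90a` as usable only on an exact match.

```python
def ptx_can_run_on(ptx_arch, device_cc):
    """ptx_arch is a string like "90" or "90a"; device_cc is a tuple like (12, 0)."""
    arch_specific = ptx_arch.endswith("a")
    digits = ptx_arch.rstrip("a")
    # Last digit is the minor version, the rest is the major version.
    built_for = (int(digits[:-1]), int(digits[-1]))
    if arch_specific:
        # Architecture-specific PTX is not forward compatible.
        return device_cc == built_for
    return device_cc >= built_for
```

Under this model `90a` fails on an SM 12.0 device while `90` succeeds, which matches the revert described in the PR.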
…26222)

This pull request introduces a minor update to the test infrastructure and test files, focusing on improving build configuration and test utility usage.

Build configuration:
* Updated `cmake/onnxruntime_test_pch.cmake` to only enable precompiled headers for test targets when not performing a minimal build, preventing unnecessary PCH usage in minimal build scenarios.

Test utility improvements:
* Added an include for `asserts.h` in `onnxruntime/test/platform/file_io_test.cc`, ensuring test assertions are available and improving test reliability.

---------

Co-authored-by: Copilot <[email protected]>
…soft#26253)

### Description
Add check for ARM64 SME to MlasDynamicQGemmBatch() unit tests. Now, ARM64 SME is required or else MlasDynamicQGemmBatch() is a no-op.

### Motivation and Context
Fix unit test failures.
…26155)

### Description
Update Linux device discovery to add rudimentary detection of whether a GPU is discrete and include that in the device metadata. Now, an Nvidia GPU is considered to be discrete. We can refine this later.

### Motivation and Context
Include more potentially useful metadata.
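The rudimentary rule can be pictured as a vendor check. This is only an illustration of the heuristic as described, assuming the decision is keyed on the PCI vendor ID; `is_discrete_gpu` is a hypothetical name, and 0x10DE is NVIDIA's PCI vendor ID.

```python
NVIDIA_VENDOR_ID = 0x10DE  # PCI vendor ID assigned to NVIDIA


def is_discrete_gpu(vendor_id):
    """Rudimentary heuristic: treat every NVIDIA GPU as discrete."""
    return vendor_id == NVIDIA_VENDOR_ID
```

GPUs from other vendors are reported as not discrete under this rule, which is the part the description says can be refined later.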
### Description
Update the version number in the main branch to 1.24.

### Motivation and Context
Replaces PR microsoft#25545, which is stuck on the CLA check.

---------

Co-authored-by: vraspar <[email protected]>
### Description
Upgrade torch version to address a component governance alert.
…BufferBindingSize limit (microsoft#25962)

### Description
When an input is bigger than maxStorageBufferBindingSize, use multiple binding entries for it. The implementation of `getByOffset`/`setByOffset` is refined so that if, say, `input_b` is 257MB but maxStorageBufferBindingSize is 256MB, `b.getByOffset(offset)` returns the correct element without the caller needing to care about which binding entry holds it. It generates shader code like this:

```
var<storage, read> input_b: array<vec4<u32>>; // [0, 256MB) of input_b
var<storage, read> input_b1: array<vec4<u32>>; // [256MB, 257MB) of input_b
```

### Motivation and Context
QC's maxStorageBufferBindingSize is 256MB, which is not enough for the phi-4 model. So for QC, we customized a new phi4 model which uses the `slice` op to split the big matrix. That means we need to keep two different phi4 models for different platforms.

### For reviewers
The core logic is located in:
- Shader side:
  - `shader_helper.cc`. In the shader, use more `@group(0) @binding(....` entries to match the actual number of buffers.
  - `shader_variable.cc`. Implement the `set_xxx_by_offset(global_offset, value)` and `get_xxx_by_offset(global_offset)` shader helper functions, which are used by `setByOffset`/`getByOffset` when the input exceeds maxStorageBufferBindingSize.
- WebGPU API side:
  - `webgpu_context.cc`. In the WebGPU API, use more group entries to match the actual number of buffers.
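The offset arithmetic implied by the description can be sketched on the host side. This is an assumption-laden illustration working in bytes, not the generated WGSL or the code in `shader_variable.cc`; `split_bindings` and `locate` are hypothetical names.

```python
def split_bindings(total_bytes, max_binding_bytes):
    """Return (start, end) byte ranges, one per binding entry."""
    ranges = []
    start = 0
    while start < total_bytes:
        end = min(start + max_binding_bytes, total_bytes)
        ranges.append((start, end))
        start = end
    return ranges


def locate(global_offset, max_binding_bytes):
    """Map a global byte offset to (binding index, offset within that binding)."""
    return divmod(global_offset, max_binding_bytes)
```

For the 257MB example with a 256MB limit, `split_bindings` yields two ranges that correspond to `input_b` and `input_b1`, and `locate` picks the second binding for any offset at or beyond 256MB.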
### Description
When there is an optional input (empty input type) in the OrtShapeInferContext construction, use undefined data type and empty shape as a placeholder.

### Motivation and Context
VitisAI EP may add nodes with optional inputs during graph optimization to meet the requirements of AMD AI compilers. This fix may help other execution providers to improve the graph optimization process.
…oft#26227)

Updated enforcement conditions to allow a tensor of any rank that contains a single element, according to the ONNX spec.

Fix microsoft#26218
Fix microsoft#26265

---------

Signed-off-by: Justin Chu <[email protected]>
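The relaxed condition amounts to checking the element count rather than the rank. A minimal sketch, assuming the check operates on a list of static dimensions; `is_single_element` is a hypothetical helper, not the enforcement code in ONNX Runtime.

```python
import math


def is_single_element(shape):
    """True for any rank as long as the tensor holds exactly one element."""
    # math.prod of an empty shape is 1, so rank-0 scalars pass as well.
    return math.prod(shape) == 1
```

Shapes such as `[]`, `[1]`, and `[1, 1, 1]` all pass, while `[2]` and `[0]` do not.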
### Description
Enable handling the following EP options:
- preferredLayout
- forceCpuNodeNames
- device
- enableGraphCapture (this is set in session options, not EP options, but it is eventually passed to ORT as an EP option)
…#26268) Fixes chatterbox model for transformers.js
ankitm3k approved these changes on Oct 10, 2025
Description
Synchronizing intel/onnxruntime ovep-develop branch with latest changes from microsoft/onnxruntime master branch.