@Jaswanth51

Description

Synchronizing intel/onnxruntime ovep-develop branch with latest changes from microsoft/onnxruntime master branch.

fdwr and others added 21 commits October 1, 2025 17:31
### Description
Fixes microsoft#11523. Scalars should just be no-ops for Tile. Simply removing the
over-validation lets the case work.

### Motivation and Context
Conformance with expectation.
### Description
Previously, local window size of GQA op excluded the current token. This
does not match standard HuggingFace implementations where tokens are
appended and then local masking occurs; the mismatch can cause the mask
to be off by 1 during generation, leading to accuracy issues. This PR
corrects this mismatch by including the current token. In practice, this
effectively decreases GQA window size by 1.
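
A minimal sketch of the off-by-one described above, written in plain Python rather than taken from the GQA kernel. The function name `local_window_mask` and the `include_current` switch are hypothetical, and the sketch assumes that "excluding the current token" means a query sees itself plus `window` earlier tokens, while "including" means it sees `window` tokens in total.

```python
def local_window_mask(seq_len, window, include_current=True):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Hypothetical helper for illustration only; not the ONNX Runtime implementation.
    """
    # Largest allowed distance (i - j) between query and key.
    # Distance 0 is the current token itself.
    max_distance = window - 1 if include_current else window
    return [
        [0 <= i - j <= max_distance for j in range(seq_len)]
        for i in range(seq_len)
    ]


new_mask = local_window_mask(5, 2)                         # window counts the current token
old_mask = local_window_mask(5, 2, include_current=False)  # window excludes the current token

print(sum(new_mask[4]))  # number of keys visible to the last query under the new rule
print(sum(old_mask[4]))  # number of keys visible under the old rule
```

Under these assumptions, the last query sees 2 keys with the new rule and 3 with the old one, which is the "window size decreases by 1" effect mentioned in the description.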
 


### Motivation and Context
This helps align our models with HuggingFace models.

---------

Co-authored-by: Kunal Vaishnavi <[email protected]>
… definitions. (microsoft#26225)

### Description
<!-- Describe your changes. -->

Add CMake `mlas_private_compile_definitions` variable for internal MLAS
definitions.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Refactor so that it is easier to add new private compile definitions for
MLAS and related targets.
### Description
<!-- Describe your changes. -->
- Introduce a new command-line flag `--skip_pip_install`, similar to the
existing `--skip_submodule_sync`, to allow users to prevent `build.py`
from modifying their current Python environment.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
- This is useful for scenarios where dependencies are already managed
externally or when users want to avoid unintended changes during the
build process.
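
As a rough illustration of how such a flag can gate an environment-modifying step, here is a small `argparse` sketch. It is not the actual `build.py` code: the parser, the `planned_steps` helper, and the step names are assumptions made for this example; only the two flag names come from the description above.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the two flag names are taken from the PR description.
    parser = argparse.ArgumentParser(description="Illustrative build script")
    parser.add_argument("--skip_submodule_sync", action="store_true",
                        help="Do not update git submodules before building.")
    parser.add_argument("--skip_pip_install", action="store_true",
                        help="Do not install Python packages into the current environment.")
    return parser


def planned_steps(args):
    """Return the list of steps the illustrative script would run."""
    steps = []
    if not args.skip_submodule_sync:
        steps.append("sync submodules")
    if not args.skip_pip_install:
        steps.append("pip install requirements")
    steps.append("configure and build")
    return steps


args = build_parser().parse_args(["--skip_pip_install"])
print(planned_steps(args))
```

With `--skip_pip_install` set, the sketch leaves the Python environment untouched and goes straight from the submodule sync to the build.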
Bumps [gradle/actions](https://github.com/gradle/actions) from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/gradle/actions/releases">gradle/actions's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade to node 24 by <a
href="https://github.com/amyu"><code>@​amyu</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/721">gradle/actions#721</a></li>
</ul>
<p>Make sure your runner is updated to this version or newer to use this
release. v2.327.1 <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<h3>Dependency upgrades</h3>
<ul>
<li>Bump the github-actions group across 1 directory with 2 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/748">gradle/actions#748</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/gradle/actions/compare/v4...v5.0.0">https://github.com/gradle/actions/compare/v4...v5.0.0</a></p>
<h2>v4.4.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Bump the github-actions group across 2 directories with 3 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/726">gradle/actions#726</a></li>
<li>Regenerating package lock by <a
href="https://github.com/cdsap"><code>@​cdsap</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/729">gradle/actions#729</a></li>
<li>Update known wrapper checksums by <a
href="https://github.com/github-actions"><code>@​github-actions</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/730">gradle/actions#730</a></li>
<li>Bump the github-actions group across 1 directory with 3 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/735">gradle/actions#735</a></li>
<li>Bump the gradle group across 3 directories with 1 update by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/734">gradle/actions#734</a></li>
<li>Bump the npm-dependencies group in /sources with 4 updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/733">gradle/actions#733</a></li>
<li>Bump references to Develocity Gradle plugin from 4.1.1 to 4.2 by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/736">gradle/actions#736</a></li>
<li>Handle gracefully parse errors in checksum file by <a
href="https://github.com/jprinet"><code>@​jprinet</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/737">gradle/actions#737</a></li>
<li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in
/.github/workflow-samples/kotlin-dsl by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/742">gradle/actions#742</a></li>
<li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in
/.github/workflow-samples/java-toolchain by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/741">gradle/actions#741</a></li>
<li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in
/.github/workflow-samples/groovy-dsl by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/740">gradle/actions#740</a></li>
<li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in
/.github/workflow-samples/gradle-plugin by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/739">gradle/actions#739</a></li>
<li>Bump Gradle Wrapper from 9.0.0 to 9.1.0 in
/sources/test/init-scripts by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/738">gradle/actions#738</a></li>
<li>Update known wrapper checksums by <a
href="https://github.com/github-actions"><code>@​github-actions</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/743">gradle/actions#743</a></li>
<li>Bump com.google.guava:guava from 33.4.8-jre to 33.5.0-jre in
/.github/workflow-samples/kotlin-dsl in the gradle group across 1
directory by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/746">gradle/actions#746</a></li>
<li>Bump the npm-dependencies group in /sources with 5 updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/745">gradle/actions#745</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/gradle/actions/compare/v4...v4.4.4">https://github.com/gradle/actions/compare/v4...v4.4.4</a></p>
<h2>v4.4.3</h2>
<h2>What's Changed</h2>
<ul>
<li>Adapt tests to future new Build Scan publication message by <a
href="https://github.com/alextu"><code>@​alextu</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/708">gradle/actions#708</a></li>
<li>Add missing Gradle version input to setup-gradle by <a
href="https://github.com/jprinet"><code>@​jprinet</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/713">gradle/actions#713</a></li>
<li>Bump the github-actions group across 2 directories with 4 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/710">gradle/actions#710</a></li>
<li>Bump references to Develocity Gradle plugin from 4.1 to 4.1.1 by <a
href="https://github.com/bot-githubaction"><code>@​bot-githubaction</code></a>
in <a
href="https://redirect.github.com/gradle/actions/pull/712">gradle/actions#712</a></li>
<li>Update known wrapper checksums by <a
href="https://github.com/github-actions"><code>@​github-actions</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/709">gradle/actions#709</a></li>
<li>Bump the npm-dependencies group across 1 directory with 4 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/711">gradle/actions#711</a></li>
<li>Do not run setup-gradle post action if workflow is cancelled by <a
href="https://github.com/jprinet"><code>@​jprinet</code></a> in <a
href="https://redirect.github.com/gradle/actions/pull/716">gradle/actions#716</a></li>
<li>Bump the github-actions group across 2 directories with 2 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/715">gradle/actions#715</a></li>
<li>Bump the npm-dependencies group across 1 directory with 3 updates by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/720">gradle/actions#720</a></li>
<li>Bump github/codeql-action from 3.29.11 to 3.30.0 in the
github-actions group across 1 directory by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/719">gradle/actions#719</a></li>
<li>Bump com.fasterxml.jackson.dataformat:jackson-dataformat-smile from
2.19.2 to 2.20.0 in /sources/test/init-scripts in the gradle group
across 1 directory by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/718">gradle/actions#718</a></li>
<li>Update known wrapper checksums by <a
href="https://github.com/github-actions"><code>@​github-actions</code></a>[bot]
in <a
href="https://redirect.github.com/gradle/actions/pull/723">gradle/actions#723</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/gradle/actions/commit/4d9f0ba0025fe599b4ebab900eb7f3a1d93ef4c2"><code>4d9f0ba</code></a>
Bump the github-actions group across 1 directory with 2 updates (<a
href="https://redirect.github.com/gradle/actions/issues/748">#748</a>)</li>
<li><a
href="https://github.com/gradle/actions/commit/4b530e369bfef1ac8fc2160ec97b9fda1ccd9901"><code>4b530e3</code></a>
Bump the github-actions group across 1 directory with 2 updates</li>
<li><a
href="https://github.com/gradle/actions/commit/e60655a8a03bf3b9a7ff400dc5ef49bed725bec8"><code>e60655a</code></a>
Upgrade to node 24 (<a
href="https://redirect.github.com/gradle/actions/issues/721">#721</a>)</li>
<li>See full diff in <a
href="https://github.com/gradle/actions/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description
Enable handling of a few new EP options.
### Description
Move packaging pipelines' macOS build jobs to a new machine pool that is
much faster, which can reduce the build time by 85%.
However, it has some drawbacks:
1. It does not have Java installed. Also, ADO's JavaToolInstaller task
does not support macOS. So I have to remove these things from the
pipeline. We will stop providing java binaries for macOS until this
issue is resolved.
2. It does not have CocoaPods installed. But CocoaPods itself is being
deprecated, so it is not a big deal.
The `check_emulator_running_using_avd_name` function should not assume
adb is in PATH. This PR fixes the issue.
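
One way to avoid relying on PATH is to look for `adb` under a known SDK root first and only then fall back to a PATH lookup. The sketch below assumes the standard Android SDK layout (`<sdk_root>/platform-tools/adb`); the helper name `resolve_adb` and its parameter are hypothetical and do not reflect the actual change in this PR.

```python
import os
import shutil


def resolve_adb(sdk_root=None):
    """Return a path to adb, preferring an explicit SDK root over PATH.

    Hypothetical helper for illustration; returns None when adb cannot be found.
    """
    exe_name = "adb.exe" if os.name == "nt" else "adb"
    if sdk_root:
        candidate = os.path.join(sdk_root, "platform-tools", exe_name)
        if os.path.isfile(candidate):
            return candidate
    # Fall back to PATH only when no usable SDK root was given.
    return shutil.which("adb")


print(resolve_adb(os.environ.get("ANDROID_HOME")))
```

The caller can then pass the resolved path to every `adb` invocation instead of the bare command name.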
### Description
Add -Wno-deprecated-declarations when compiling unit tests with TRT/NV EP
support.


### Motivation and Context
Fixes source build for these EPs.

Signed-off-by: Kevin Chen <[email protected]>
…osoft#26230)

Users with RTX 5090 GPUs are experiencing runtime errors when using
onnxruntime-gpu:
```
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Slice node. 
Name:'Slice_34' Status Message: CUDA error cudaErrorNoKernelImageForDevice:
no kernel image is available for execution on the device
```
This occurs because RTX 5090 uses CUDA compute architecture 12.0 (SM
12.0), while `onnxruntime-gpu` 1.23 was built with `90a-virtual`. The
`90a` architecture is a specialized, non-forward-compatible version of
the Hopper architecture, making it incompatible with future GPU
generations like Blackwell.

This change reverts `90a-virtual` back to `90-virtual`, as used in
1.22. This should restore compatibility with Blackwell GPUs.

FPA_INTB_GEMM is disabled by default. It needs some extra work to
make it compatible with the `90-virtual` and no `90a-real` use case.

Related:
microsoft#26002
microsoft#26226
microsoft#26181
…26222)

This pull request introduces a minor update to the test infrastructure
and test files, focusing on improving build configuration and test
utility usage.

Build configuration:

* Updated `cmake/onnxruntime_test_pch.cmake` to only enable precompiled
headers for test targets when not performing a minimal build, preventing
unnecessary PCH usage in minimal build scenarios.

Test utility improvements:

* Added an include for `asserts.h` in
`onnxruntime/test/platform/file_io_test.cc`, ensuring test assertions
are available and improving test reliability.

---------

Co-authored-by: Copilot <[email protected]>
…soft#26253)

### Description
<!-- Describe your changes. -->

Add check for ARM64 SME to MlasDynamicQGemmBatch() unit tests. Now,
ARM64 SME is required or else MlasDynamicQGemmBatch() is a no-op.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Fix unit test failures.
…26155)

### Description
<!-- Describe your changes. -->

Update Linux device discovery to add rudimentary detection of whether a
GPU is discrete and include that in the device metadata.

Now, an Nvidia GPU is considered to be discrete. We can refine this later.
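
The rule above can be pictured as a simple vendor check. This is a sketch, not the device discovery code: `is_discrete_gpu`, `device_metadata`, and the metadata key names are hypothetical, and the only outside fact used is that NVIDIA's PCI vendor ID is `0x10DE`.

```python
NVIDIA_PCI_VENDOR_ID = 0x10DE  # PCI vendor ID assigned to NVIDIA


def is_discrete_gpu(vendor_id):
    """Rudimentary heuristic: treat NVIDIA GPUs as discrete, everything else as not."""
    return vendor_id == NVIDIA_PCI_VENDOR_ID


def device_metadata(vendor_id):
    # Key names are made up for this example.
    return {
        "vendor_id": f"0x{vendor_id:04x}",
        "discrete": "1" if is_discrete_gpu(vendor_id) else "0",
    }


print(device_metadata(0x10DE))
print(device_metadata(0x8086))
```

A heuristic like this misclassifies discrete GPUs from other vendors, which is presumably the refinement the description leaves for later.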

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Include more potentially useful metadata.
### Description
Update the version number in the main branch to 1.24


### Motivation and Context
To replace PR microsoft#25545. The old one is stuck on the CLA check.

---------

Co-authored-by: vraspar <[email protected]>
### Description
Upgrade torch version to address component governance alert
…BufferBindingSize limit (microsoft#25962)

### Description
When an input is bigger than maxStorageBufferBindingSize, use multiple
binding entries for it. We refine the implementation of
`getByOffset`/`setByOffset` so that if, say, `input_b` is 257MB but
maxStorageBufferBindingSize is 256MB, we can use `b.getByOffset(offset)`
to get the correct element without caring about which binding entry
holds it. It will generate shader code like this:
```
var<storage, read> input_b: array<vec4<u32>>; // [0, 256MB) of input_b
var<storage, read> input_b1: array<vec4<u32>>; // [256MB, 257MB) of input_b
```
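
The offset arithmetic behind this can be sketched in a few lines of Python. This is an illustration under assumptions, not the code in `shader_variable.cc`: the helper names are hypothetical, and the 16-byte element size assumes `vec4<u32>` elements as in the shader snippet above.

```python
MB = 1024 * 1024
VEC4_U32_BYTES = 16  # assumed element size: four 32-bit values


def split_into_bindings(total_bytes, max_binding_bytes):
    """Return (start, end) byte ranges, each no larger than max_binding_bytes."""
    if max_binding_bytes <= 0:
        raise ValueError("max_binding_bytes must be positive")
    ranges = []
    start = 0
    while start < total_bytes:
        end = min(start + max_binding_bytes, total_bytes)
        ranges.append((start, end))
        start = end
    return ranges


def locate(global_offset, max_binding_bytes, element_bytes=VEC4_U32_BYTES):
    """Map a global element offset to (binding index, offset within that binding)."""
    elements_per_binding = max_binding_bytes // element_bytes
    return divmod(global_offset, elements_per_binding)


print(split_into_bindings(257 * MB, 256 * MB))
print(locate(256 * MB // VEC4_U32_BYTES, 256 * MB))
```

For the 257MB example, the first call yields two ranges, `[0, 256MB)` and `[256MB, 257MB)`, and the first element past the 256MB boundary maps to binding 1 at local offset 0.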

### Motivation and Context
QC's maxStorageBufferBindingSize is 256MB, which is not enough for the
phi-4 model. So for QC, we customized a new phi4 model which uses the
`slice` op to split the big matrix. That means we need to keep two
different phi4 models for different platforms.

### For reviewers
The core logic is located in:
- Shader side:
  - `shader_helper.cc`. In the shader, use more `@group(0) @binding(....`
    entries to match the actual number of buffers.
  - `shader_variable.cc`. Implement the `set_xxx_by_offset(global_offset,
    value)` and `get_xxx_by_offset(global_offset)` shader helper functions,
    which are used by `setByOffset`/`getByOffset` when the input exceeds
    maxStorageBufferBindingSize.
- WebGPU API side:
  - `webgpu_context.cc`. In the WebGPU API, use more group entries to
    match the actual number of buffers.
### Description
When there is an optional input (empty input type) in the
OrtShapeInferContext construction, use undefined data type and empty
shape as a placeholder.

### Motivation and Context
VitisAI EP may add nodes with optional inputs during graph optimization
to meet the requirements of AMD AI compilers.

This fix may help other execution providers to improve the graph
optimization process.
…oft#26227)

Updated enforcement conditions to allow any dimensional tensor with a
single element according to the ONNX spec.
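
The relaxed condition amounts to checking the element count rather than the rank. A minimal sketch, assuming shapes are given as sequences of non-negative integers; the function name `is_single_element` is hypothetical and not part of ONNX Runtime.

```python
from math import prod


def is_single_element(shape):
    """True when a tensor of this shape holds exactly one element, whatever its rank."""
    # The product of an empty shape is 1, so a rank-0 scalar also qualifies.
    return prod(shape) == 1


for shape in [(), (1,), (1, 1, 1), (2,), (1, 0)]:
    print(shape, is_single_element(shape))
```

A check like this accepts `()`, `(1,)`, and `(1, 1, 1)` alike, while still rejecting shapes with more than one element or with a zero-sized dimension.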

Fixes microsoft#26218
Fixes microsoft#26265

---------

Signed-off-by: Justin Chu <[email protected]>
### Description
Enable handling the following EP options:

- preferredLayout
- forceCpuNodeNames
- device
- enableGraphCapture (this is set in session options, not EP options,
but it is eventually passed to ORT as EP options)
@Jaswanth51 Jaswanth51 requested a review from ankitm3k October 10, 2025 03:06
@ankitm3k ankitm3k merged commit 2652479 into ovep-develop Oct 10, 2025
6 of 8 checks passed
@ankitm3k ankitm3k deleted the sync_msft_10102025 branch October 10, 2025 10:57