Commit dde9e20

Updated feature description
1 parent 0ec2dc7 commit dde9e20

1 file changed: +8 -31 lines changed

docs/execution-providers/OpenVINO-ExecutionProvider.md

Lines changed: 8 additions & 31 deletions
@@ -174,7 +174,7 @@ OpenVINO™ Execution Provider for ONNX Runtime allows multiple stream execution
174174
175175
### Auto-Device Execution for OpenVINO EP
176176
177-
Use `AUTO:<device 1><device 2>..` as the device name to delegate selection of an actual accelerator to OpenVINO™. Auto-device internally recognizes and selects devices from CPU, integrated GPU, discrete Intel GPUs (when available) and NPU (when available) depending on the device capabilities and the characteristic of CNN models, for example, precisions. Then Auto-device assigns inference requests to the selected device.
177+
Use `AUTO:<device 1>,<device 2>..` as the device name to delegate selection of an actual accelerator to OpenVINO™. Auto-device internally recognizes and selects devices from CPU, integrated GPU, discrete Intel GPUs (when available) and NPU (when available) depending on the device capabilities and the characteristic of CNN models, for example, precisions. Then Auto-device assigns inference requests to the selected device.
178178
179179
From the application point of view, this is just another device that handles all accelerators in full system.
180180
@@ -242,34 +242,11 @@ where "DEVICE_KEY" can be CPU, NPU or GPU , "PROPERTY" must be a valid entity de
242242
243243
Exception during initialization: [json.exception.type_error.302] type must be string, but is a number.
244244
245-
While one can set the int/bool values like this "NPU_TILES": "2" which is valid (refer to the example given below).
245+
While one can set the int/bool values like this "NPU_TILES": "2" which is valid.
246246
If someone passes incorrect keys, it will be skipped with a warning while incorrect values assigned to a valid key will result in an exception arising from OV framework.
247247
248248
The valid properties are of 2 types viz. MUTABLE (R/W) & IMMUTABLE (R ONLY) these are also governed while setting the same. If an IMMUTABLE property is being set, we skip setting the same with a similar warning.
249249
250-
Example:
251-
252-
The usage of this functionality using onnxruntime_perf_test application is as below –
253-
254-
```
255-
onnxruntime_perf_test.exe -e openvino -m times -r 1 -i "device_type|NPU load_config|npu_config.json" model.onnx
256-
```
257-
where the npu_config.json file is defined as below –
258-
259-
```bash
260-
{
261-
"NPU": {
262-
"PERFORMANCE_HINT": "THROUGHPUT",
263-
"WORKLOAD_TYPE": "Efficient",
264-
"NPU_TILES": "2",
265-
"LOG_LEVEL": "LOG_DEBUG",
266-
"NPU_COMPILATION_MODE_PARAMS": "enable-weights-swizzling=false enable-activation-swizzling=false enable-grouped-matmul=false"
267-
}
268-
}
269-
270-
```
271-
To explicitly enable logs one must use "LOG_LEVEL": "LOG_DEBUG" in the JSON device configuration property. The log verifies that the correct device parameters and properties are being set / populated during runtime with OVEP.
272-
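The hunk above keeps the rule that every property value in the device configuration JSON must be a string (a bare number triggers `json.exception.type_error.302`), while removing the worked example. As a rough illustration of that rule only, the following sketch scans a parsed configuration for non-string values before it is handed to the execution provider. The helper name `find_non_string_values` is hypothetical and not part of ONNX Runtime or OpenVINO; the sketch also assumes the top level maps device keys to flat property dictionaries, as in the removed `npu_config.json`.

```python
import json


def find_non_string_values(config):
    """Return (device, property, value) tuples whose value is not a string.

    Hypothetical pre-check, not an ONNX Runtime API. Assumes `config` maps
    a device key (e.g. "NPU") to a flat dict of property names and values.
    """
    offenders = []
    for device, properties in config.items():
        for name, value in properties.items():
            if not isinstance(value, str):
                offenders.append((device, name, value))
    return offenders


# A config modelled on the removed example, with one deliberate mistake:
# NPU_TILES is written as a JSON number instead of the string "2".
raw = '{"NPU": {"PERFORMANCE_HINT": "THROUGHPUT", "NPU_TILES": 2}}'
print(find_non_string_values(json.loads(raw)))
```

This check only covers the string-typing rule. Unknown keys, invalid values for a valid key, and attempts to set IMMUTABLE properties are still handled at runtime as the documentation text describes.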
273250
### OpenVINO Execution Provider Supports EP-Weight Sharing across sessions
274251
The OpenVINO Execution Provider (OVEP) in ONNX Runtime supports EP-Weight Sharing, enabling models to efficiently share weights across multiple inference sessions. This feature enhances the execution of Large Language Models (LLMs) with prefill and KV cache, reducing memory consumption and improving performance when running multiple inferences.
275252
@@ -278,11 +255,11 @@ With EP-Weight Sharing, prefill and KV cache models can now reuse the same set o
278255
These changes enable weight sharing between two models using the session context option: ep.share_ep_contexts.
279256
Refer to [Session Options](https://github.com/microsoft/onnxruntime/blob/5068ab9b190c549b546241aa7ffbe5007868f595/include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h#L319) for more details on configuring this runtime option.
280257
281-
### OVEP supports CreateSessionFromArray API
282-
The OpenVINO Execution Provider (OVEP) in ONNX Runtime supports creating sessions from memory using the CreateSessionFromArray API. This allows loading models directly from memory buffers instead of file paths. The CreateSessionFromArray loads the model in memory then creates a session from the in-memory byte array.
258+
### OVEP supports CreateSessionFromArray API
259+
The OpenVINO Execution Provider (OVEP) in ONNX Runtime supports creating sessions from memory using the CreateSessionFromArray API. This allows loading models directly from memory buffers instead of file paths. The CreateSessionFromArray loads the model in memory then creates a session from the in-memory byte array.
283260
284-
Note:
285-
Use the -l argument when running the inference with perf_test using CreateSessionFromArray API.
261+
Note:
262+
Use the -l argument when running the inference with perf_test using CreateSessionFromArray API.
286263
287264
## Configuration Options
288265
@@ -360,8 +337,8 @@ The following table lists all the available configuration options for API 2.0 an
360337
361338
362339
Valid Hetero or Multi or Auto Device combinations:
363-
HETERO:<DEVICE_TYPE_1>,<DEVICE_TYPE_2>,<DEVICE_TYPE_3>...
364-
The <DEVICE_TYPE> can be any of these devices from this list ['CPU','GPU', 'NPU']
340+
`HETERO:<device 1>,<device 2>...`
341+
The `device` can be any of these devices from this list ['CPU','GPU', 'NPU']
365342
366343
A minimum of two DEVICE_TYPE'S should be specified for a valid HETERO, MULTI, or AUTO Device Build.
367344
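The device-string changes in this commit (adding the missing comma in `AUTO:<device 1>,<device 2>..` and aligning the `HETERO` placeholder style) can be illustrated with a small parser. This is a minimal sketch, not the provider's own validation logic: the function name `parse_device_combination` is hypothetical, and it assumes the only accepted device names are the three listed in the documentation text (`CPU`, `GPU`, `NPU`), with at least two devices required.

```python
KNOWN_DEVICES = ("CPU", "GPU", "NPU")          # list quoted in the doc text
COMBINED_MODES = ("HETERO", "MULTI", "AUTO")   # modes named in the doc text


def parse_device_combination(device_type):
    """Split 'MODE:DEV1,DEV2,...' into (mode, [devices]).

    Hypothetical helper for illustration. Raises ValueError when the mode
    is unknown, a device is not in KNOWN_DEVICES, or fewer than two
    devices are given.
    """
    mode, sep, device_list = device_type.partition(":")
    if not sep or mode not in COMBINED_MODES:
        raise ValueError(f"expected one of {COMBINED_MODES} followed by ':'")
    devices = device_list.split(",")
    for device in devices:
        if device not in KNOWN_DEVICES:
            raise ValueError(f"unknown device {device!r}")
    if len(devices) < 2:
        raise ValueError("at least two devices are required")
    return mode, devices


print(parse_device_combination("AUTO:GPU,CPU"))
```

Under these assumptions, a string written in the pre-fix style without the comma, such as `AUTO:GPUCPU`, is rejected because `GPUCPU` is read as a single unknown device name.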
