
Commit 2dd7182 (1 parent: a160eed)

[TensorRT EP] Update docs for ORT 1.21 & latest TRT (microsoft#23995)

### Description

* Update the version support matrix
* Add a note for ORT 1.21 open-source parser users

Preview this change:

* https://yf711.github.io/onnxruntime/docs/build/eps.html#tensorrt
* https://yf711.github.io/onnxruntime/docs/execution-providers/TensorRT-ExecutionProvider.html#requirements

2 files changed: +43 −34 lines

docs/build/eps.md

Lines changed: 23 additions & 15 deletions
@@ -110,36 +110,48 @@ See more information on the TensorRT Execution Provider [here](../execution-prov

* Follow [instructions for CUDA execution provider](#cuda) to install CUDA and cuDNN, and setup environment variables.
* Follow [instructions for installing TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/latest/installing-tensorrt/installing.html)
- * The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 10.0.
+ * The TensorRT execution provider for ONNX Runtime is built and tested with TensorRT 10.8.
* The path to TensorRT installation must be provided via the `--tensorrt_home` parameter.
- * ONNX Runtime uses TensorRT built-in parser from `tensorrt_home` by default.
+ * ONNX Runtime uses the [TensorRT built-in parser](https://developer.nvidia.com/tensorrt/download) from `tensorrt_home` by default.
* To use the open-source [onnx-tensorrt](https://github.com/onnx/onnx-tensorrt/tree/main) parser instead, add the `--use_tensorrt_oss_parser` parameter to the build commands below.
- * The default version of the open-source onnx-tensorrt parser is encoded in [cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/main/cmake/deps.txt).
+ * The default version of the open-source onnx-tensorrt parser is specified in [cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/main/cmake/deps.txt).
* To specify a different version of the onnx-tensorrt parser:
  * Select the commit of [onnx-tensorrt](https://github.com/onnx/onnx-tensorrt/commits) that you prefer;
  * Run `sha1sum` on the downloaded onnx-tensorrt zip file to acquire its SHA1 hash;
  * Update [cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/main/cmake/deps.txt) with the new onnx-tensorrt commit and hash info (see the sketch after this list).
+ * If `--use_tensorrt_oss_parser` is enabled, please make sure the TensorRT built-in parser and the open-source onnx-tensorrt version specified in [cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/main/cmake/deps.txt) are **version-matched**.
+   * e.g., it is version-matched if `tensorrt_home` points to TensorRT 10.9 built-in binaries and the onnx-tensorrt [10.9-GA branch](https://github.com/onnx/onnx-tensorrt/tree/release/10.9-GA) is specified in [cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/main/cmake/deps.txt).
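As a rough illustration of the pinning steps above, a minimal sketch (the `onnx_tensorrt` key name and the archive URL pattern are assumptions based on the existing `name;URL;SHA1` format of cmake/deps.txt, not taken from this commit):

```bash
# Pick an onnx-tensorrt commit and compute the hash cmake/deps.txt expects
COMMIT=<full-commit-sha-you-picked>          # placeholder
wget "https://github.com/onnx/onnx-tensorrt/archive/${COMMIT}.zip" -O onnx-tensorrt.zip
sha1sum onnx-tensorrt.zip                    # prints the SHA1 to paste into cmake/deps.txt

# Then update the matching line in cmake/deps.txt to:
#   onnx_tensorrt;https://github.com/onnx/onnx-tensorrt/archive/<full-commit-sha-you-picked>.zip;<sha1-from-above>
```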
+
+ ### **[Note to ORT 1.21.0 open-source parser users]**
+
+ * ORT 1.21.0 links against onnx-tensorrt 10.8-GA, which requires the upcoming onnx 1.18.
+ * Here is a temporary fix to preview onnx-tensorrt 10.8-GA (or newer) when building ORT 1.21.0 (sketched below):
+   * Replace the [onnx line in cmake/deps.txt](https://github.com/microsoft/onnxruntime/blob/rel-1.21.0/cmake/deps.txt#L38) with `onnx;https://github.com/onnx/onnx/archive/f22a2ad78c9b8f3bd2bb402bfce2b0079570ecb6.zip;324a781c31e30306e30baff0ed7fe347b10f8e3c`
+   * Download [this](https://github.com/microsoft/onnxruntime/blob/7b2733a526c12b5ef4475edd47fd9997ebc2b2c6/cmake/patches/onnx/onnx.patch) as a raw file and save it to [cmake/patches/onnx/onnx.patch](https://github.com/microsoft/onnxruntime/blob/rel-1.21.0/cmake/patches/onnx/onnx.patch) (do not copy/paste from the browser, as that might alter the line-break type)
+   * Build ORT 1.21.0 with the TRT-related flags above (including `--use_tensorrt_oss_parser`)
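The steps above, expressed as a shell sketch (the raw.githubusercontent.com URL is derived from the blob link above, and the `sed` pattern assumes the onnx entry occupies a single line; verify both before relying on this):

```bash
# Inside a rel-1.21.0 checkout of onnxruntime:
# 1. Point the onnx entry in cmake/deps.txt at the newer onnx archive
sed -i 's|^onnx;.*|onnx;https://github.com/onnx/onnx/archive/f22a2ad78c9b8f3bd2bb402bfce2b0079570ecb6.zip;324a781c31e30306e30baff0ed7fe347b10f8e3c|' cmake/deps.txt

# 2. Fetch the patch as a raw file so line endings are preserved
curl -L -o cmake/patches/onnx/onnx.patch \
  "https://raw.githubusercontent.com/microsoft/onnxruntime/7b2733a526c12b5ef4475edd47fd9997ebc2b2c6/cmake/patches/onnx/onnx.patch"

# 3. Rebuild with the TensorRT flags, including --use_tensorrt_oss_parser
```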

### Build Instructions
{: .no_toc }

#### Windows
```bash
# to build with tensorrt built-in parser
- .\build.bat --cudnn_home <path to cuDNN home> --cuda_home <path to CUDA home> --use_tensorrt --tensorrt_home <path to TensorRT home> --cmake_generator "Visual Studio 17 2022"
+ .\build.bat --config Release --parallel --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' --cudnn_home <path to cuDNN home> --cuda_home <path to CUDA home> --use_tensorrt --tensorrt_home <path to TensorRT home> --cmake_generator "Visual Studio 17 2022"

# to build with specific version of open-sourced onnx-tensorrt parser configured in cmake/deps.txt
- .\build.bat --cudnn_home <path to cuDNN home> --cuda_home <path to CUDA home> --use_tensorrt --tensorrt_home <path to TensorRT home> --use_tensorrt_oss_parser --cmake_generator "Visual Studio 17 2022"
+ .\build.bat --config Release --parallel --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' --cudnn_home <path to cuDNN home> --cuda_home <path to CUDA home> --use_tensorrt --tensorrt_home <path to TensorRT home> --use_tensorrt_oss_parser --cmake_generator "Visual Studio 17 2022"
```

#### Linux

```bash
# to build with tensorrt built-in parser
- ./build.sh --cudnn_home <path to cuDNN e.g. /usr/lib/x86_64-linux-gnu/> --cuda_home <path to folder for CUDA e.g. /usr/local/cuda> --use_tensorrt --tensorrt_home <path to TensorRT home>
+ ./build.sh --config Release --parallel --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' --cudnn_home <path to cuDNN e.g. /usr/lib/x86_64-linux-gnu/> --cuda_home <path to folder for CUDA e.g. /usr/local/cuda> --use_tensorrt --tensorrt_home <path to TensorRT home>

# to build with specific version of open-sourced onnx-tensorrt parser configured in cmake/deps.txt
- ./build.sh --cudnn_home <path to cuDNN e.g. /usr/lib/x86_64-linux-gnu/> --cuda_home <path to folder for CUDA e.g. /usr/local/cuda> --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home <path to TensorRT home> --skip_submodule_sync
+ ./build.sh --config Release --parallel --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' --cudnn_home <path to cuDNN e.g. /usr/lib/x86_64-linux-gnu/> --cuda_home <path to folder for CUDA e.g. /usr/local/cuda> --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home <path to TensorRT home> --skip_submodule_sync
```

Dockerfile instructions are available [here](https://github.com/microsoft/onnxruntime/tree/main/dockerfiles#tensorrt)
@@ -164,7 +176,7 @@ These instructions are for the latest [JetPack SDK](https://developer.nvidia.com
2. Specify the CUDA compiler, or add its location to the PATH.

1. JetPack 5.x users can upgrade to the latest CUDA release without updating the JetPack version or Jetson Linux BSP (Board Support Package).

   1. For JetPack 5.x users, CUDA >= 11.8 and GCC > 9.4 are required as of ONNX Runtime 1.17.

   2. Check [this official blog](https://developer.nvidia.com/blog/simplifying-cuda-upgrades-for-nvidia-jetson-users/) for CUDA upgrade instructions (CUDA 12.2 has been verified on JetPack 5.1.2 on Jetson Xavier NX).
@@ -198,14 +210,10 @@ These instructions are for the latest [JetPack SDK](https://developer.nvidia.com
```bash
sudo apt install -y --no-install-recommends \
  build-essential software-properties-common libopenblas-dev \
- libpython3.8-dev python3-pip python3-dev python3-setuptools python3-wheel
+ libpython3.10-dev python3-pip python3-dev python3-setuptools python3-wheel
```

- 4. Cmake is needed to build ONNX Runtime. The minimum required CMake version is 3.26. This can be either installed by:
-
- 1. (Unix/Linux) Build from source. Download sources from [https://cmake.org/download/](https://cmake.org/download/)
- and follow [https://cmake.org/install/](https://cmake.org/install/) to build from source.
- 2. (Ubuntu) Install deb package via apt repository: e.g [https://apt.kitware.com/](https://apt.kitware.com/)
+ 4. CMake is needed to build ONNX Runtime. Please check the minimum required CMake version [here](https://github.com/microsoft/onnxruntime/blob/main/cmake/CMakeLists.txt#L6). Download it from https://cmake.org/download/ and add the cmake executable to `PATH` to use it (one way is sketched below).
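One way to do that, sketched under the assumption of an aarch64 Jetson host and Kitware's prebuilt release tarballs (the version number is a placeholder; any download from cmake.org works the same way):

```bash
# Install a prebuilt CMake and put it on PATH (version is a placeholder)
CMAKE_VER=3.28.3
wget "https://github.com/Kitware/CMake/releases/download/v${CMAKE_VER}/cmake-${CMAKE_VER}-linux-aarch64.tar.gz"
tar -xzf "cmake-${CMAKE_VER}-linux-aarch64.tar.gz"
export PATH="$PWD/cmake-${CMAKE_VER}-linux-aarch64/bin:$PATH"
cmake --version   # verify it meets the minimum required version
```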

5. Build the ONNX Runtime Python wheel:

@@ -221,7 +229,7 @@ These instructions are for the latest [JetPack SDK](https://developer.nvidia.com

* By default, the `onnxruntime-gpu` wheel file will be placed under `path_to/onnxruntime/build/Linux/Release/dist/` (the build path can be customized by adding `--build_dir` followed by a custom path to the build command above).

- * Append `--skip_tests --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=72;87' 'onnxruntime_BUILD_UNIT_TESTS=OFF' 'onnxruntime_USE_FLASH_ATTENTION=OFF'
+ * Append `--skip_tests --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' 'onnxruntime_BUILD_UNIT_TESTS=OFF' 'onnxruntime_USE_FLASH_ATTENTION=OFF'
'onnxruntime_USE_MEMORY_EFFICIENT_ATTENTION=OFF'` to the build command to opt out of optional features and reduce build time (combined in the sketch after this list).

* For some Jetson devices, such as the Xavier series, higher power modes enable more cores (up to 6) for compute but consume more resources when building ONNX Runtime. Set `--parallel 1` in the build command if OOM happens and the system hangs.
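Putting the pieces together, a hedged sketch of a full Jetson wheel build (the aarch64 library paths are assumptions about a typical JetPack install, not taken from this page; adjust to your system):

```bash
# Sketch: Jetson wheel build with the opt-out flags above
./build.sh --config Release --parallel --build_wheel \
  --use_tensorrt --cuda_home /usr/local/cuda \
  --cudnn_home /usr/lib/aarch64-linux-gnu \
  --tensorrt_home /usr/lib/aarch64-linux-gnu \
  --skip_tests --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=native' \
  'onnxruntime_BUILD_UNIT_TESTS=OFF' 'onnxruntime_USE_FLASH_ATTENTION=OFF' \
  'onnxruntime_USE_MEMORY_EFFICIENT_ATTENTION=OFF'
# the wheel lands under build/Linux/Release/dist/ by default
```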

docs/execution-providers/TensorRT-ExecutionProvider.md

Lines changed: 20 additions & 19 deletions
@@ -29,24 +29,25 @@ See [Build instructions](../build/eps.md#tensorrt).

Note: Starting with version 1.19, **CUDA 12** becomes the default version when distributing ONNX Runtime GPU packages.

- | ONNX Runtime | TensorRT | CUDA |
- | :----------- | :------- | :------------- |
- | main | 10.5 | **12.x**, 11.8 |
- | 1.20 | 10.5 | **12.x**, 11.8 |
- | 1.19 | 10.2 | **12.x**, 11.8 |
- | 1.18 | 10.0 | 11.8, 12.x |
- | 1.17 | 8.6 | 11.8, 12.x |
- | 1.16 | 8.6 | 11.8 |
- | 1.15 | 8.6 | 11.8 |
- | 1.14 | 8.5 | 11.6 |
- | 1.12-1.13 | 8.4 | 11.4 |
- | 1.11 | 8.2 | 11.4 |
- | 1.10 | 8.0 | 11.4 |
- | 1.9 | 8.0 | 11.4 |
- | 1.7-1.8 | 7.2 | 11.0.3 |
- | 1.5-1.6 | 7.1 | 10.2 |
- | 1.2-1.4 | 7.0 | 10.1 |
- | 1.0-1.1 | 6.0 | 10.0 |
+ | ONNX Runtime | TensorRT | CUDA |
+ | :----------- | :------- | :------------------ |
+ | main | 10.9 | **12.0-12.8**, 11.8 |
+ | 1.21 | 10.8 | **12.0-12.8**, 11.8 |
+ | 1.20 | 10.4 | **12.0-12.6**, 11.8 |
+ | 1.19 | 10.2 | **12.0-12.6**, 11.8 |
+ | 1.18 | 10.0 | 11.8, 12.0-12.6 |
+ | 1.17 | 8.6 | 11.8, 12.0-12.6 |
+ | 1.16 | 8.6 | 11.8 |
+ | 1.15 | 8.6 | 11.8 |
+ | 1.14 | 8.5 | 11.6 |
+ | 1.12-1.13 | 8.4 | 11.4 |
+ | 1.11 | 8.2 | 11.4 |
+ | 1.10 | 8.0 | 11.4 |
+ | 1.9 | 8.0 | 11.4 |
+ | 1.7-1.8 | 7.2 | 11.0.3 |
+ | 1.5-1.6 | 7.1 | 10.2 |
+ | 1.2-1.4 | 7.0 | 10.1 |
+ | 1.0-1.1 | 6.0 | 10.0 |

For more details on CUDA/cuDNN versions, please see [CUDA EP requirements](./CUDA-ExecutionProvider.md#requirements).

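As a hedged illustration of the CUDA 12 default noted above (the provider-list check is just a sanity test, not from the docs):

```bash
# Since ORT 1.19, the default onnxruntime-gpu wheel targets CUDA 12.x
pip install onnxruntime-gpu
python -c "import onnxruntime as ort; print(ort.get_available_providers())"
# TensorrtExecutionProvider appears when the TensorRT libraries are discoverable;
# CUDA 11.8 builds ship from a separate package index (see the ORT install docs)
```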
@@ -265,7 +266,7 @@ TensorRT configurations can be set by execution provider options. It's useful wh
assert options["TensorrtExecutionProvider"].get("has_user_compute_stream", "") == "1"
...
```

</Details>

* To take advantage of a user compute stream, it is recommended to use [I/O Binding](https://onnxruntime.ai/docs/api/python/api_summary.html#data-on-device) to bind inputs and outputs to tensors on the device (a minimal sketch follows).
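For illustration, a minimal Python sketch of device-resident I/O binding (the model path and the tensor names "input"/"output" are placeholders, not from the docs):

```python
import numpy as np
import onnxruntime as ort

# Placeholder model; in practice pass your provider options here as well
sess = ort.InferenceSession("model.onnx", providers=["TensorrtExecutionProvider"])

# Keep the input on GPU 0 so each run avoids a host-to-device copy
x = ort.OrtValue.ortvalue_from_numpy(
    np.random.rand(1, 3, 224, 224).astype(np.float32), "cuda", 0)

binding = sess.io_binding()
binding.bind_ortvalue_input("input", x)
binding.bind_output("output", "cuda")  # let ORT allocate the output on device

sess.run_with_iobinding(binding)
y = binding.get_outputs()[0].numpy()   # copy back to host only when needed
```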
