PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
This is the second beta release of TRTorch, targeting PyTorch 1.7.x, CUDA 11.0 (on x86_64), TensorRT 7.2, and cuDNN 8. TRTorch 0.2.0 for aarch64 targets JetPack 4.5.x. This release updates the to_backend integration for PyTorch to reflect changes in the PyTorch API. Since TF32 is now the default FP32 format used in TRTorch, a new API has been added to disable the TF32 data format introduced on Ampere. APIs have been solidified for runtime configuration of the active CUDA device, letting users choose which device a program is deserialized on. This API will continue to change as we further define the serialization format and work with the PyTorch team to make runtime device configuration more ergonomic; you can follow this work here: https://github.com/NVIDIA/TRTorch/discussions/311. This release also formalizes DLA support in TRTorch, adding APIs and capabilities to target DLA on Jetson and DRIVE platforms.

v0.2.0 also includes a new shared library, libtrtorchrt.so, which contains only the runtime components of TRTorch and is suitable for situations where device footprint is extremely limited. libtrtorchrt.so can be linked into C++ applications or loaded into Python scripts, and will load all necessary TRTorch runtime components into the Torch runtime, allowing users to run TRTorch-compiled programs without the full compiler. v0.2.0 also adds support for Python 3.9.
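The TF32 toggle and device/DLA targeting described above can be sketched as entries in a compile spec. This is a minimal sketch: the key names below (input_shapes, disable_tf32, device, device_type, gpu_id) are assumptions based on these release notes, not a verified transcription of the 0.2.0 API, so check the trtorch documentation for your version.

```python
# Hypothetical TRTorch compile spec illustrating the new 0.2.0 knobs.
# All key names are assumptions drawn from the release notes above.
compile_spec = {
    "input_shapes": [[1, 3, 224, 224]],  # example static input shape (hypothetical)
    "disable_tf32": True,                # opt out of TF32, forcing strict FP32 on Ampere
    "device": {
        "device_type": "gpu",            # "dla" would target the DLA on Jetson/DRIVE
        "gpu_id": 0,                     # which CUDA device the program is built/deserialized on
    },
}

# With a trtorch installation (not assumed here), compilation would look like:
# import torch, trtorch
# trt_module = trtorch.compile(torch.jit.script(model), compile_spec)
```

For runtime-only deployment, the idea is that the compiled TorchScript program can then be loaded with torch.jit.load once libtrtorchrt.so has been loaded into the process, without the full compiler present.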
Dependencies:
- Bazel 4.0.0
- Libtorch 1.7.1 (on x86_64), 1.7.0 (on aarch64)
- CUDA 11.0 (by default, newer CUDA 11 supported with compatible PyTorch build)
- cuDNN 8.0.5
- TensorRT 7.2.2
Signed-off-by: Naren Dasan [email protected]
This is the first "beta" release of TRTorch, introducing direct integration into PyTorch via the new backend API. This release also contains an NGC-based Dockerfile for users looking to use TRTorch on Ampere with NGC's patched version of PyTorch. Note that programs compiled with older versions of TRTorch are not compatible with the TRTorch 0.1.0 runtime due to an ABI change. Example Jupyter notebooks demonstrating various features of the compiler are now included in the documentation.
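The backend-API integration mentioned above can be sketched as follows. This is a hedged sketch: the backend name "tensorrt", the per-method spec layout, and the key names are assumptions based on the TRTorch documentation of this era, not guaranteed by these notes, and the lowering call itself is shown commented out because it needs a trtorch installation to register the backend.

```python
# Sketch of lowering a TorchScript module through PyTorch's to_backend
# path as integrated by TRTorch 0.1.0. All names here (the "tensorrt"
# backend string, the spec layout) are assumptions; consult the docs.
compile_spec = {
    "forward": {                             # one entry per method to compile
        "input_shapes": [[1, 3, 300, 300]],  # hypothetical input shape
        "op_precision": "fp32",              # or "fp16" / "int8"
    }
}

# Requires torch and a trtorch installation registering the backend:
# import torch, trtorch
# script_model = torch.jit.script(model)
# trt_model = torch._C._jit_to_backend("tensorrt", script_model, compile_spec)
```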
New Ops:
- added some fixes, trt/jit output still mismatches (723ac1d)
- added test cases to explicitly check hidden/cell state outputs (d7c3164)
- cleaned up logic, added case where bias doesn't exist for LSTM cell converter (a3e1093)
- //core/conversion/evaluator: Custom to IValue that handles int[] (68c934a)
- //docker: Workaround only shared libraries being available in (50c7eda)
- //py: Fix long description section of setup.py (efd2099)
- //tests: Add stride to complete tensors (af5d28e)
- //tests/accuracy: Fix int8 accuracy test for new PTQ api (a53bea7)
- //tests/core/converters/activations: Complete tensors in prelu test (0e90f78)
- docsrc: Update docsrc container for bazel 3.4.1 (4eb53b5)
- fix(Windows)!: Fix dependency resolution for local builds (858d8c3)
- chore!: Update dependencies to PyTorch 1.6.0 (8eda27d)
- chore!: Bumping version numbers to 0.1.0 (b84c90b)
- refactor(//core)!: Introducing a binding convention that will address (5a105c6)
- refactor!: Renaming extra info to compile spec to be more consistent (b8fa228)
Signed-off-by: Naren Dasan [email protected]
This is the third alpha release of TRTorch. It bumps the target PyTorch version to 1.5.1 and introduces support for cuDNN 8.0 and TensorRT 7.1; however, these are only supported in cases where PyTorch has been compiled with the same cuDNN version. This release also introduces formal support for aarch64, although pre-compiled binaries will not be available until we can deliver Python packages for aarch64 for all supported versions of Python. Note some idiosyncrasies when working with PyTorch on aarch64: if you are using PyTorch compiled by NVIDIA for aarch64, the ABI version is CXX11 instead of the pre-CXX11 ABI found in PyTorch on x86_64. When compiling the Python API for TRTorch, add the --use-cxx11-abi flag to the command, and do not use the --config=pre-cxx11-abi flag when building the C++ library (more instructions on native aarch64 compilation can be found in the documentation). This release also introduces a breaking change to the C++ API: to use the logging or PTQ APIs, a separate header file must now be included. Look at the implementation of trtorchc or ptq for example usage.
Documentation on how to install Bazel has also been added to support aarch64 until Bazel releases binaries for the platform (which should be soon).
Signed-off-by: Naren Dasan [email protected]