This code contains:

- An MLIR dialect that attempts to precisely model the TensorRT operator set. It provides static verification and type inference, optimizations, translations to TensorRT by invoking the TensorRT builder API (`nvinfer1::INetworkBuilder`), and translations to C++ calls to the builder API.
- Conversions from StableHLO to the TensorRT dialect.
- An example compiler infrastructure for building a compiler and runtime that offloads complex sub-programs to TensorRT, complete with support for (bounded) dynamic shapes and a Python interface.

Note that the TensorRT dialect is under the top-level `tensorrt` folder and can be built as an independent project in case the other features are not needed for your use case.
We currently support building only on Linux x86 systems.
We support several different ways of building (all via CMake) depending on your use case.
In each case, the LLVM-Project version that we are currently aligned to is
given in `build_tools/cmake/LLVMCommit.txt`.
Note that we currently provide an LLVM patch which essentially cherry-picks the bug fixes from an open upstream MLIR PR (llvm-project PR 91524, applied in the build steps below).
1. Build as a Standalone Project with LLVM downloaded by CMake
2. Build as a Standalone Project with LLVM provided by User
3. Build as a sub-project of a larger build (e.g. via `add_subdirectory`)
4. Build via the LLVM-External-Projects mechanism
Here we only show how to do Option 1 and Option 2.
Option 1 is the simplest way to get started and incurs low download overhead, since we download LLVM-Project as a zip archive directly from GitHub at our pinned commit.
```sh
# See CMakePresets.json for convenient CMake presets.
# Preset 'ninja-llvm' uses the Ninja generator, clang, and
# LLD, but the GNU GCC toolchain is also supported (use preset
# 'ninja-gcc').
#
# By default, the CMake build system will download a version
# of TensorRT for you.
cmake --preset ninja-llvm

# Example build commands:

# Build everything
ninja -C build all

# Build and run tests
ninja -C build check-mlir-executor
ninja -C build check-mlir-tensorrt-dialect
ninja -C build check-mlir-tensorrt

# Build wheels (output in `build/wheels`)
ninja -C build mlir-tensorrt-all-wheels
```
Option 2 is more complex but lets you "bring your own LLVM-Project" source code or binary.
- Build MLIR
```sh
# Clone llvm-project
git clone https://github.com/llvm/llvm-project.git llvm-project

# Checkout the right commit. Of course, you may try
# a newer commit or your own modified LLVM-Project.
cd llvm-project
git checkout $(cat ../build_tools/cmake/LLVMCommit.cmake | grep -Po '(?<=").*(?=")')

# Apply patch from llvm-project PR 91524
git apply ../build_tools/llvm-project.patch

# Do the build
cd ..
./build_tools/scripts/build_mlir.sh llvm-project build/llvm-project
```
- Build the project and run all tests
```sh
cmake -B ./build/mlir-tensorrt -S . -G Ninja \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
  -DMLIR_TRT_USE_LINKER=lld \
  -DMLIR_TRT_PACKAGE_CACHE_DIR=${PWD}/.cache.cpm \
  -DMLIR_DIR=build/llvm-project/lib/cmake/mlir \
  -DCMAKE_PLATFORM_NO_VERSIONED_SONAME=ON

ninja -C build/mlir-tensorrt all
ninja -C build/mlir-tensorrt check-mlir-executor
ninja -C build/mlir-tensorrt check-mlir-tensorrt-dialect
ninja -C build/mlir-tensorrt check-mlir-tensorrt
```
- Build Python binding wheels

This will produce wheels under `build/mlir-tensorrt/wheels`:

```sh
ninja -C build/mlir-tensorrt mlir-tensorrt-all-wheels
```
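As a hypothetical follow-up, the produced wheels can be installed into a Python environment with `pip`; the exact wheel filenames depend on your Python version and build configuration:

```sh
# Install all wheels produced by the build (filenames are illustrative).
python3 -m pip install build/mlir-tensorrt/wheels/*.whl
```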
Our CMake-based build system will by default attempt to download a
version of TensorRT to use during building and testing. This is controlled
by the CMake cache variable `MLIR_TRT_DOWNLOAD_TENSORRT_VERSION`.

To instead use a local TensorRT version, simply set the CMake
cache variable `MLIR_TRT_TENSORRT_DIR` to the path of the TensorRT
installation directory (containing directories `include`, `lib64`, and
so on), and set `MLIR_TRT_DOWNLOAD_TENSORRT_VERSION` to the empty string.
These variables are fed into the CMake function `find_tensorrt`, which is
invoked from the project's CMake configuration. Its `INSTALL_DIR` and
`DOWNLOAD_VERSION` options are mutually exclusive.
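For example, a configure command along the following lines would point the build at a local installation; the preset name comes from the earlier example, and the install path is illustrative:

```sh
# Configure against a local TensorRT install instead of downloading one.
# /opt/tensorrt is an illustrative path; use your own installation directory.
cmake --preset ninja-llvm \
  -DMLIR_TRT_TENSORRT_DIR=/opt/tensorrt \
  -DMLIR_TRT_DOWNLOAD_TENSORRT_VERSION=""
```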
All executables built by the project will link TensorRT dynamically and load it
dynamically at runtime using the runtime environment's default dynamic library
search mechanism. The LIT testing configurations used in the project set the
dynamic library search path (e.g. the environment variable `LD_LIBRARY_PATH` on
Linux systems) to ensure that the TensorRT version used during compilation is
also used during testing.

When invoking an executable (e.g. `mlir-tensorrt-opt`) directly outside of the
LIT test runner, one should set the appropriate environment variables (e.g.
`LD_LIBRARY_PATH`) to point to the TensorRT library which should be loaded at runtime.
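On Linux, this might look like the following; the TensorRT library path and the tool invocation are illustrative:

```sh
# Make the desired TensorRT libraries visible to the dynamic loader, then
# run a tool directly instead of going through the LIT test runner.
export LD_LIBRARY_PATH=/opt/tensorrt/lib:$LD_LIBRARY_PATH
./build/mlir-tensorrt/bin/mlir-tensorrt-opt --help
```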
In general, if the project is compiled with TensorRT `X.Y` but version `X.Z`
is loaded at runtime, with `Z > Y`, the software is expected to work, but
no guarantees are currently made.