CUDA and TensorRT Starter Workspace

This repository guides beginners who have no background in parallel programming in C++ through learning CUDA and TensorRT from scratch.

This repository is still a work in progress (as of 24/02/21). I will add more samples and more detailed descriptions in the future. Please feel free to contribute to this repository.

How to install

First, clone the repository:

git clone git@github.com:kalfazed/tensorrt_starter.git

After cloning the repository, set the OpenCV, CUDA, cuDNN, and TensorRT versions and install directories in config/Makefile.config, located in the root directory of the repository. The recommended versions for this repository are opencv==4.x, cuda==11.6, cudnn==8.9, TensorRT==8.6.1.6.

# Change the CUDA version if needed
# By default, the cuDNN library is located in /usr/local/cuda/lib64
CXX                         :=  g++
CUDA_VER                    :=  11

# Set the OpenCV and TensorRT install directories
OPENCV_INSTALL_DIR          :=  /usr/local/include/opencv4
TENSORRT_INSTALL_DIR        :=  /home/kalfazed/packages/TensorRT-8.6.1.6

Also change ARCH in config/Makefile.config to match your GPU. This parameter is passed to nvcc, the compiler for CUDA programs.
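If you are unsure which value to use for ARCH, the sketch below may help. It assumes that ARCH takes an nvcc-style name such as sm_86 (check config/Makefile.config for the exact format it expects) and that your driver's nvidia-smi supports the compute_cap query field, which older drivers may not. The helper name cc_to_arch is hypothetical and not part of this repository.

```shell
# Convert a compute capability such as "8.6" into an nvcc-style name such as "sm_86".
# cc_to_arch is a hypothetical helper, not something shipped with this repository.
cc_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# If nvidia-smi is available, ask it for the compute capability of the first GPU.
# Assumption: the installed driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    cc_to_arch "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
fi
```

For example, `cc_to_arch 8.6` prints sm_86. If nvidia-smi does not recognize the field, NVIDIA's compute capability table lists the value for your GPU model.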

How to run

Inside each subfolder of each chapter, the basic directory structure is as follows (some chapters differ):

|-config
    |- Makefile.config
|-src
    |- cpp
        |- xxx.c
    |- python
        |- yyy.py
|-Makefile

Run make first. It generates a binary named trt-cuda or trt-infer, depending on the chapter. Then run the binary directly or use the make run command.
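The build-and-run steps above can be wrapped in a small shell function. This is only a sketch: build_and_run is a hypothetical helper, and the directory you pass to it must be a sample subfolder that contains a Makefile with a run target, as described above.

```shell
# Build one sample and run it through its Makefile.
# build_and_run is a hypothetical helper; "$1" is the path to a sample subfolder.
build_and_run() {
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" && make && make run )
}

# Usage (replace the placeholder with a real subfolder of a chapter):
#   build_and_run chapter2-cuda-programming/<sample-subfolder>
```

Because the steps are chained with `&&`, make run is skipped if the build fails, so a compile error is not followed by an attempt to run a stale or missing binary.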

Chapter description

chapter1-build-environment

chapter2-cuda-programming

chapter3-tensorrt-basics-and-onnx

chapter4-tensorrt-optimiztion

chapter5-tensorrt-api-basics

chapter6-deploy-classification-and-inference-design

chapter7-deploy-yolo-detection