YOLOv10 Triton Jetson

Introduction

This project provides guidance on exporting the YOLOv10 model from PyTorch to ONNX, converting it to a TensorRT engine, and deploying it on the NVIDIA Triton Inference Server on a Jetson device running JetPack 5.1.3.

Instructions

  1. Clone the Project and Set Up the Environment

    First, clone the project repository:

    git clone https://github.com/thanhlnbka/yolov10-triton-jetson.git

    Next, download and extract the Triton server release for JetPack (a quick JetPack version check is sketched after this step):

    wget https://github.com/triton-inference-server/server/releases/download/v2.34.0/tritonserver2.34.0-jetpack5.1.tgz
    tar -xvzf tritonserver2.34.0-jetpack5.1.tgz
    cp -r clients yolov10-triton-jetson/.
    cp -r tritonserver yolov10-triton-jetson/servers/.
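
The Triton release above targets JetPack 5.1, so it is worth confirming the board's L4T/JetPack version before downloading. These are standard Jetson commands rather than part of this repository:

    # Print the L4T release string (JetPack 5.1.x corresponds to L4T r35.x)
    cat /etc/nv_tegra_release
    # Or query the installed JetPack meta-package version
    apt-cache show nvidia-jetpack | grep Version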

I. Set Up the Triton Server on Jetson

Export YOLOv10 Model from PyTorch to ONNX

  1. Build the yolov10 Docker image:

    git clone https://github.com/THU-MIG/yolov10.git
    cd yolov10/docker
    docker build -t yolov10 -f Dockerfile-jetson .
  2. Use the env-yolov10 container to export the model to ONNX (a shape-inspection hint follows this list):

    docker run -it --network host --gpus all --runtime nvidia --name env-yolov10 yolov10 bash
    yolo export model=jameslahm/yolov10m.pt format=onnx opset=13 simplify
  3. Copy the Model to the Working Directory:

    docker cp env-yolov10:/usr/src/ultralytics/jameslahm yolov10-triton-jetson/servers/models
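
Before converting, it can help to confirm the exported model's input and output tensor names and shapes, since any hand-written Triton config and the client pre/post-processing must match them. A minimal sketch, assuming the onnx Python package is available inside the env-yolov10 container (adjust the .onnx path to wherever the export actually landed):

    docker exec env-yolov10 python3 -c "import onnx; m = onnx.load('yolov10m.onnx'); print('inputs: ', [(i.name, [d.dim_value for d in i.type.tensor_type.shape.dim]) for i in m.graph.input]); print('outputs:', [(o.name, [d.dim_value for d in o.type.tensor_type.shape.dim]) for o in m.graph.output])"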

Convert ONNX Model to TensorRT Engine

  1. Run the TensorRT Docker Container:

    docker run -it -v $(pwd)/yolov10-triton-jetson/servers:/servers --network host --gpus all --runtime nvidia --name env-engine nvcr.io/nvidia/l4t-tensorrt:r8.5.2.2-devel bash
  2. Convert ONNX to TensorRT (an optional engine check follows this list):

    cd /servers/models
    /usr/src/tensorrt/bin/trtexec --onnx=yolov10m.onnx --saveEngine=yolov10m.engine --fp16 --useCudaGraph
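
Optionally, the generated engine can be sanity-checked and roughly benchmarked inside the same container before handing it to Triton:

    # Load the serialized engine and run trtexec's default timing loop
    /usr/src/tensorrt/bin/trtexec --loadEngine=yolov10m.engine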

Deploy YOLOv10 on the Triton Server

  1. Set Up Triton (a note on the model configuration follows this list)

    docker exec -it env-engine bash
    cd /servers
    sh setup_triton.sh 
    mkdir -p model_repository/yolov10m/1
    cp /servers/models/yolov10m.engine /servers/model_repository/yolov10m/1/model.plan
  2. Start Triton

    tritonserver --model-repository=/servers/model_repository  --backend-directory=/servers/tritonserver/backends --log-verbose=1
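
Triton expects each model in the repository to have a configuration. If setup_triton.sh does not already generate one, a minimal config.pbtxt for the TensorRT engine might look like the sketch below; the tensor names and dimensions here are assumptions based on a typical YOLOv10m ONNX export and should be verified against your model (for example with the ONNX inspection command shown earlier):

    # /servers/model_repository/yolov10m/config.pbtxt
    # Tensor names and shapes are assumptions -- verify against your exported model
    name: "yolov10m"
    platform: "tensorrt_plan"
    max_batch_size: 0
    input [
      {
        name: "images"
        data_type: TYPE_FP32
        dims: [ 1, 3, 640, 640 ]
      }
    ]
    output [
      {
        name: "output0"
        data_type: TYPE_FP32
        dims: [ 1, 300, 6 ]
      }
    ]

Depending on the Triton version, the server may also be able to auto-complete a configuration for TensorRT plan models, but an explicit config.pbtxt keeps the deployment predictable.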

II. Set Up the Client to Communicate with Triton

  1. Required Environment for Building the Client Source Code (an example install command follows this list):

    • OpenCV
    • RapidJSON
    • CURL
  2. Build the Client Source Code:

    cd yolov10-triton-jetson
    mkdir build
    cd build
    cmake .. && make 
  3. Test the Client:

    ./triton-client <path_to_image>
  4. Demo Result: see all_about_people_cover.jpeg in the repository.
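
On a JetPack (Ubuntu-based) system, the build dependencies listed above can usually be installed from the standard repositories; the package names below are the usual Ubuntu ones, given as a hint rather than taken from this repository:

    sudo apt-get update
    sudo apt-get install -y libopencv-dev rapidjson-dev libcurl4-openssl-dev

Before running the client, it can also help to confirm that the server is reachable over Triton's standard HTTP endpoints (port 8000 by default):

    # Returns HTTP 200 when the server is ready to serve inference requests
    curl -v localhost:8000/v2/health/ready
    # Shows the loaded model's metadata (name, inputs, outputs)
    curl localhost:8000/v2/models/yolov10m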

References
