Deep Learning Nodes for ROS/ROS2

This repo contains deep learning inference nodes and camera/video streaming nodes for ROS/ROS2 with support for Jetson Nano/TX1/TX2/Xavier NX/AGX Xavier and TensorRT.

The nodes use the image recognition, object detection, and semantic segmentation DNNs from the jetson-inference library and NVIDIA Hello AI World tutorial, which come with several built-in pretrained networks for classification, detection, and segmentation, as well as the ability to load custom user-trained models.

The camera/video streaming nodes support the following input/output interfaces:

  • MIPI CSI cameras
  • V4L2 cameras
  • RTP / RTSP
  • Videos & Images
  • Image sequences
  • OpenGL windows

ROS Melodic and ROS2 Eloquent are supported, and the latest version of JetPack is recommended.

Table of Contents

  • Installation
      • jetson-inference
      • ROS/ROS2
      • ros_deep_learning
  • Testing
      • Video Viewer
      • imagenet Node
      • detectnet Node
      • segnet Node
  • Topics & Parameters

Installation

First, install the latest version of JetPack on your Jetson.

Then, follow the steps below to install the needed components on your Jetson.

jetson-inference

These ROS nodes use the DNN objects from the jetson-inference project (aka Hello AI World). To build and install jetson-inference, see this page or run the commands below:

$ cd ~
$ sudo apt-get install git cmake
$ git clone --recursive https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ mkdir build
$ cd build
$ cmake ../
$ make -j$(nproc)
$ sudo make install
$ sudo ldconfig

Before proceeding, it's worthwhile to test that jetson-inference is working properly on your system by following this step of the Hello AI World tutorial.

ROS/ROS2

Install the ros-melodic-ros-base or ros-eloquent-ros-base package on your Jetson following these directions.

Depending on which version of ROS you're using, install some additional dependencies and create a workspace:

ROS Melodic

$ sudo apt-get install ros-melodic-image-transport ros-melodic-vision-msgs

For ROS Melodic, create a Catkin workspace (~/ros_workspace) using these steps:
http://wiki.ros.org/ROS/Tutorials/InstallingandConfiguringROSEnvironment#Create_a_ROS_Workspace

ROS Eloquent

$ sudo apt-get install ros-eloquent-vision-msgs \
                       ros-eloquent-launch-xml \
                       ros-eloquent-launch-yaml \
                       python3-colcon-common-extensions

For ROS Eloquent, create a workspace (~/ros_workspace) to use:

$ mkdir -p ~/ros_workspace/src

ros_deep_learning

Next, navigate into your ROS workspace's src directory and clone ros_deep_learning:

$ cd ~/ros_workspace/src
$ git clone https://github.com/dusty-nv/ros_deep_learning

Then build it - if you are using ROS Melodic, use catkin_make. If you are using ROS2 Eloquent, use colcon build:

$ cd ~/ros_workspace/

# ROS Melodic
$ catkin_make
$ source devel/setup.bash 

# ROS2 Eloquent
$ colcon build
$ source install/local_setup.bash 

The nodes should now be built and ready to use. Remember to source the overlay as shown above so that ROS can find the nodes.

Testing

Before proceeding, if you're using ROS Melodic make sure that roscore is running first:

$ roscore

If you're using ROS2, running the core service is no longer required.

Video Viewer

First, it's recommended to test that you can stream a video feed using the video_source and video_output nodes. See Camera Streaming & Multimedia for valid input/output streams, and substitute your desired input and output argument below. For example, you can use video files for the input or output, or use V4L2 cameras instead of MIPI CSI cameras. You can also use RTP/RTSP streams over the network.

# ROS Melodic
$ roslaunch ros_deep_learning video_viewer.ros1.launch input:=csi://0 output:=display://0

# ROS2 Eloquent
$ ros2 launch ros_deep_learning video_viewer.ros2.launch input:=csi://0 output:=display://0

imagenet Node

You can launch a classification demo with the following commands, substituting your desired camera or video path for the input argument below (see here for valid input/output streams).

Note that the imagenet node also publishes classification metadata on the imagenet/classification topic in a vision_msgs/Classification2D message -- see the Topics & Parameters section below for more info.

# ROS Melodic
$ roslaunch ros_deep_learning imagenet.ros1.launch input:=csi://0 output:=display://0

# ROS2 Eloquent
$ ros2 launch ros_deep_learning imagenet.ros2.launch input:=csi://0 output:=display://0
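
To consume the classification results programmatically rather than just viewing the overlay, you can subscribe to the classification topic. The following is a minimal, hypothetical ROS Melodic (rospy) sketch that is not part of this repo; it assumes the launch file runs the node in the imagenet namespace (so the topic resolves to /imagenet/classification) and that your vision_msgs version uses numeric id and score fields in ObjectHypothesis:

```python
import rospy
from vision_msgs.msg import Classification2D

def on_classification(msg):
    # each result carries a class ID and a confidence score
    for result in msg.results:
        rospy.loginfo("class %d, confidence %.2f", result.id, result.score)

rospy.init_node("imagenet_listener")
rospy.Subscriber("/imagenet/classification", Classification2D, on_classification)
rospy.spin()
```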

detectnet Node

To launch an object detection demo, substitute your desired camera or video path for the input argument below (see here for valid input/output streams). Note that the detectnet node also publishes detection metadata in a vision_msgs/Detection2DArray message -- see the Topics & Parameters section below for more info.

# ROS Melodic
$ roslaunch ros_deep_learning detectnet.ros1.launch input:=csi://0 output:=display://0

# ROS2 Eloquent
$ ros2 launch ros_deep_learning detectnet.ros2.launch input:=csi://0 output:=display://0
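
If you want to use the detections in your own node, a small subscriber is enough. The following is a hypothetical ROS2 Eloquent (rclpy) sketch; it assumes the topic resolves to /detectnet/detections and uses the standard vision_msgs Detection2DArray fields (bbox.center, bbox.size_x/size_y, results[].score):

```python
import rclpy
from rclpy.node import Node
from vision_msgs.msg import Detection2DArray

class DetectionListener(Node):
    def __init__(self):
        super().__init__('detection_listener')
        self.create_subscription(Detection2DArray, '/detectnet/detections',
                                 self.on_detections, 10)

    def on_detections(self, msg):
        # log the center, size, and best confidence of each detected object
        for det in msg.detections:
            score = det.results[0].score if det.results else 0.0
            self.get_logger().info('object at ({:.0f}, {:.0f}), size {:.0f}x{:.0f}, score {:.2f}'.format(
                det.bbox.center.x, det.bbox.center.y, det.bbox.size_x, det.bbox.size_y, score))

rclpy.init()
rclpy.spin(DetectionListener())
```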

segnet Node

To launch a semantic segmentation demo, substitute your desired camera or video path for the input argument below (see here for valid input/output streams). Note that the segnet node also publishes raw segmentation results to the segnet/class_mask topic -- see the Topics & Parameters section below for more info.

# ROS Melodic
$ roslaunch ros_deep_learning segnet.ros1.launch input:=csi://0 output:=display://0

# ROS2 Eloquent
$ ros2 launch ros_deep_learning segnet.ros2.launch input:=csi://0 output:=display://0
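
The raw per-pixel class IDs on the class_mask topic can be turned into a NumPy array with cv_bridge for downstream processing. This is a hypothetical ROS Melodic (rospy) sketch; it assumes the topic resolves to /segnet/class_mask and that the mask is published as an 8-bit single-channel image, as described in the Topics & Parameters section below:

```python
import rospy
import numpy as np
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

bridge = CvBridge()

def on_class_mask(msg):
    # convert the 8-bit single-channel mask into a NumPy array of class IDs
    mask = bridge.imgmsg_to_cv2(msg, desired_encoding='mono8')
    classes, counts = np.unique(mask, return_counts=True)
    rospy.loginfo('classes present: %s', dict(zip(classes.tolist(), counts.tolist())))

rospy.init_node('segnet_mask_listener')
rospy.Subscriber('/segnet/class_mask', Image, on_class_mask)
rospy.spin()
```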

Topics & Parameters

Below are the message topics and parameters that each node implements.

imagenet Node

| Topic Name | I/O | Message Type | Description |
|------------|-----|--------------|-------------|
| `image_in` | Input | `sensor_msgs/Image` | Raw input image |
| `classification` | Output | `vision_msgs/Classification2D` | Classification results (class ID + confidence) |
| `vision_info` | Output | `vision_msgs/VisionInfo` | Vision metadata (class labels parameter list name) |
| `overlay` | Output | `sensor_msgs/Image` | Input image overlayed with the classification results |

| Parameter Name | Type | Default | Description |
|----------------|------|---------|-------------|
| `model_name` | `string` | `"googlenet"` | Built-in model name (see here for valid values) |
| `model_path` | `string` | `""` | Path to custom caffe or ONNX model |
| `prototxt_path` | `string` | `""` | Path to custom caffe prototxt file |
| `input_blob` | `string` | `"data"` | Name of DNN input layer |
| `output_blob` | `string` | `"prob"` | Name of DNN output layer |
| `class_labels_path` | `string` | `""` | Path to custom class labels file |
| `class_labels_HASH` | `vector<string>` | class names | List of class labels, where HASH is model-specific (the actual parameter name is found via the `vision_info` topic) |
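
For example, to recover the class label strings at runtime you can read the parameter name from the vision_info topic. The sketch below is hypothetical (ROS Melodic / rospy) and assumes the VisionInfo message's database_location field carries the fully-qualified name of the class_labels_HASH parameter:

```python
import rospy
from vision_msgs.msg import VisionInfo

def on_vision_info(msg):
    # database_location is assumed to hold the name of the class labels parameter
    labels = rospy.get_param(msg.database_location, [])
    rospy.loginfo('%d class labels loaded from %s', len(labels), msg.database_location)

rospy.init_node('vision_info_listener')
rospy.Subscriber('/imagenet/vision_info', VisionInfo, on_vision_info)
rospy.spin()
```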

detectnet Node

| Topic Name | I/O | Message Type | Description |
|------------|-----|--------------|-------------|
| `image_in` | Input | `sensor_msgs/Image` | Raw input image |
| `detections` | Output | `vision_msgs/Detection2DArray` | Detection results (bounding boxes, class IDs, confidences) |
| `vision_info` | Output | `vision_msgs/VisionInfo` | Vision metadata (class labels parameter list name) |
| `overlay` | Output | `sensor_msgs/Image` | Input image overlayed with the detection results |

| Parameter Name | Type | Default | Description |
|----------------|------|---------|-------------|
| `model_name` | `string` | `"ssd-mobilenet-v2"` | Built-in model name (see here for valid values) |
| `model_path` | `string` | `""` | Path to custom caffe or ONNX model |
| `prototxt_path` | `string` | `""` | Path to custom caffe prototxt file |
| `input_blob` | `string` | `"data"` | Name of DNN input layer |
| `output_cvg` | `string` | `"coverage"` | Name of DNN output layer (coverage/scores) |
| `output_bbox` | `string` | `"bboxes"` | Name of DNN output layer (bounding boxes) |
| `class_labels_path` | `string` | `""` | Path to custom class labels file |
| `class_labels_HASH` | `vector<string>` | class names | List of class labels, where HASH is model-specific (the actual parameter name is found via the `vision_info` topic) |
| `overlay_flags` | `string` | `"box,labels,conf"` | Flags used to generate the overlay (some combination of `none`, `box`, `labels`, `conf`) |
| `mean_pixel_value` | `float` | `0.0` | Mean pixel subtraction value applied to the input (normally 0) |
| `threshold` | `float` | `0.5` | Minimum confidence value for positive detections (0.0 - 1.0) |

segnet Node

| Topic Name | I/O | Message Type | Description |
|------------|-----|--------------|-------------|
| `image_in` | Input | `sensor_msgs/Image` | Raw input image |
| `vision_info` | Output | `vision_msgs/VisionInfo` | Vision metadata (class labels parameter list name) |
| `overlay` | Output | `sensor_msgs/Image` | Input image overlayed with the segmentation results |
| `color_mask` | Output | `sensor_msgs/Image` | Colorized segmentation class mask |
| `class_mask` | Output | `sensor_msgs/Image` | 8-bit single-channel image where each pixel is a class ID |

| Parameter Name | Type | Default | Description |
|----------------|------|---------|-------------|
| `model_name` | `string` | `"fcn-resnet18-cityscapes-1024x512"` | Built-in model name (see here for valid values) |
| `model_path` | `string` | `""` | Path to custom caffe or ONNX model |
| `prototxt_path` | `string` | `""` | Path to custom caffe prototxt file |
| `input_blob` | `string` | `"data"` | Name of DNN input layer |
| `output_blob` | `string` | `"score_fr_21classes"` | Name of DNN output layer |
| `class_colors_path` | `string` | `""` | Path to custom class colors file |
| `class_labels_path` | `string` | `""` | Path to custom class labels file |
| `class_labels_HASH` | `vector<string>` | class names | List of class labels, where HASH is model-specific (the actual parameter name is found via the `vision_info` topic) |
| `mask_filter` | `string` | `"linear"` | Filtering applied to the `color_mask` topic (`linear` or `point`) |
| `overlay_filter` | `string` | `"linear"` | Filtering applied to the `overlay` topic (`linear` or `point`) |
| `overlay_alpha` | `float` | `180.0` | Alpha blending value used by the `overlay` topic (0.0 - 255.0) |

video_source Node

| Topic Name | I/O | Message Type | Description |
|------------|-----|--------------|-------------|
| `raw` | Output | `sensor_msgs/Image` | Raw output image (BGR8) |

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `resource` | `string` | `"csi://0"` | Input stream URI (see here for valid protocols) |
| `codec` | `string` | `""` | Manually specify the codec for compressed streams (see here for valid values) |
| `width` | `int` | `0` | Manually specify the desired width of the stream (0 = stream default) |
| `height` | `int` | `0` | Manually specify the desired height of the stream (0 = stream default) |
| `framerate` | `int` | `0` | Manually specify the desired framerate of the stream (0 = stream default) |
| `loop` | `int` | `0` | For video files: 0 = don't loop, >0 = number of loops, -1 = loop forever |

video_output Node

| Topic Name | I/O | Message Type | Description |
|------------|-----|--------------|-------------|
| `image_in` | Input | `sensor_msgs/Image` | Raw input image |

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `resource` | `string` | `"display://0"` | Output stream URI (see here for valid protocols) |
| `codec` | `string` | `"h264"` | Codec used for compressed streams (see here for valid values) |
| `bitrate` | `int` | `4000000` | Target VBR bitrate of encoded streams (in bits per second) |
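
As an illustration of feeding the video_output node from your own code, you can publish sensor_msgs/Image messages to its image_in topic. The following is a hypothetical ROS2 (rclpy) sketch; it assumes the image_in topic resolves to /video_output/image_in (remap as needed) and uses cv_bridge to convert an OpenCV/NumPy frame:

```python
import numpy as np
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class TestFramePublisher(Node):
    def __init__(self):
        super().__init__('test_frame_publisher')
        self.pub = self.create_publisher(Image, '/video_output/image_in', 10)
        self.bridge = CvBridge()
        self.timer = self.create_timer(1.0 / 30.0, self.on_timer)  # ~30 FPS

    def on_timer(self):
        # publish a solid green 1280x720 BGR8 test frame
        frame = np.full((720, 1280, 3), (0, 255, 0), dtype=np.uint8)
        self.pub.publish(self.bridge.cv2_to_imgmsg(frame, encoding='bgr8'))

rclpy.init()
rclpy.spin(TestFramePublisher())
```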