This is the official PyTorch implementation of the paper RetSeg3D: Retention-based 3D Semantic Segmentation for Autonomous Driving, by Gopi Krishna Erabati and Helder Araujo.
G. K. Erabati and H. Araujo, "RetSeg3D: Retention-based 3D Semantic Segmentation for Autonomous Driving," in Computer Vision and Image Understanding (CVIU), 2024. https://doi.org/10.1016/j.cviu.2024.104231
LiDAR semantic segmentation is one of the crucial tasks for scene understanding in autonomous driving. Recent trends suggest that voxel- or fusion-based methods obtain improved performance. However, fusion-based methods are computationally expensive, while voxel-based methods uniformly employ local operators (e.g., 3D SparseConv) without considering the varying-density property of LiDAR point clouds, which results in inferior performance, specifically on faraway sparse points, due to their limited receptive field. To tackle this issue, we propose a novel retention block to capture long-range dependencies and maintain the receptive field of faraway sparse points, and we design RetSeg3D, a retention-based 3D semantic segmentation model for autonomous driving. Instead of the vanilla attention mechanism for modeling long-range dependencies, inspired by RetNet, we design a cubic window multi-scale retentive self-attention (CW-MSRetSA) module with a bidirectional and 3D explicit decay mechanism that introduces 3D spatial-distance prior information into the model, improving not only the receptive field but also the model capacity. Our novel retention block maintains the receptive field, which significantly improves performance on faraway sparse points. We conduct extensive experiments and analysis on three large-scale datasets: SemanticKITTI, nuScenes and Waymo. Our method outperforms existing methods not only on faraway sparse points but also on close and medium-distance points, and it runs efficiently in real time at 52.1 FPS.
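To make the retention idea concrete, below is a minimal, hedged PyTorch sketch of retentive attention with a bidirectional, distance-based explicit decay over the voxels of a single cubic window. It illustrates the mechanism only and is not the paper's CW-MSRetSA implementation: the tensor shapes, the Manhattan-distance decay and the decay base `gamma` are assumptions chosen for clarity.

```python
# Minimal sketch of retention-style attention with an explicit 3D spatial decay,
# illustrating the idea behind CW-MSRetSA (not the paper's exact implementation).
# Shapes, the Manhattan-distance decay and gamma are illustrative assumptions.
import torch


def retention_with_3d_decay(q, k, v, coords, gamma=0.9):
    """q, k, v: (N, C) features of N voxels inside one cubic window.
    coords: (N, 3) voxel coordinates; gamma: decay base in (0, 1)."""
    scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5  # (N, N) pairwise similarity
    # Bidirectional explicit decay: weight each pair by gamma^distance, so nearby
    # voxels contribute more while distant ones are softly down-weighted.
    dist = torch.cdist(coords, coords, p=1)                # (N, N) Manhattan distances
    decay = gamma ** dist
    return (scores * decay) @ v                            # (N, C) updated features


# Toy usage: 16 voxels with 32-dim features.
feats = torch.randn(16, 32)
coords = torch.randint(0, 8, (16, 3)).float()
out = retention_with_3d_decay(feats, feats, feats, coords)
print(out.shape)  # torch.Size([16, 32])
```

Unlike vanilla attention, no softmax is applied; the pairwise scores are instead modulated by gamma raised to the spatial distance, so the contribution of distant voxels decays smoothly while remaining non-zero, which preserves a large receptive field.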
| RetSeg3D | SemanticKITTI | nuScenes | Waymo |
| --- | --- | --- | --- |
| mIoU | 70.3 | 76.9 | 70.1 |
| Config | retseg3d_semantickitti.py | retseg3d_nus.py | retseg3d_waymo.py |
| Model | weights | weights | - |
We cannot distribute the model weights for the Waymo dataset due to the Waymo license terms.
The code is tested on the following configuration:
- Ubuntu 20.04.6 LTS
- CUDA==11.7
- Python==3.8.10
- PyTorch==2.0.1
- mmcv==2.1.0
- mmengine==0.10.1
- mmdet==3.2.0
- mmdet3d==1.3.0
```bash
mkvirtualenv retseg3d
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
pip install -U openmim
mim install mmcv==2.1.0
mim install mmengine==0.10.1
pip install -r requirements.txt
```
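After installation, you can quickly verify that the pinned versions resolved correctly; this is a convenience snippet, not part of the repo:

```python
# Sanity-check the pinned dependency versions after installation.
import torch
import mmcv
import mmengine
import mmdet
import mmdet3d

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)          # expected 2.1.0
print("mmengine:", mmengine.__version__)  # expected 0.10.1
print("mmdet:", mmdet.__version__)        # expected 3.2.0
print("mmdet3d:", mmdet3d.__version__)    # expected 1.3.0
```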
Follow MMDetection3D-1.3.0 to prepare the SemanticKITTI and nuScenes datasets. Follow Pointcept for Waymo data preprocessing and then run

```bash
python tools/create_waymo_semantic_info.py /path/to/waymo/preprocess/dir
```

to generate the `.pkl` files required for the config.
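To confirm the info files were generated, you can inspect one from Python. The filename below is hypothetical, and the exact structure depends on what `create_waymo_semantic_info.py` writes:

```python
# Peek inside a generated info .pkl (the filename here is hypothetical;
# adjust it to whatever create_waymo_semantic_info.py actually produces).
import pickle

with open("/path/to/waymo/preprocess/dir/waymo_semantic_infos_train.pkl", "rb") as f:
    infos = pickle.load(f)

# mmdet3d-style info files are typically a dict or a list of per-frame dicts.
if isinstance(infos, dict):
    print("top-level keys:", list(infos.keys()))
else:
    print("frames:", len(infos), "| first entry keys:", list(infos[0].keys()))
```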
Warning: Please strictly follow the MMDetection3D-1.3.0 code to prepare the data, because other versions of MMDetection3D use a different coordinate convention after refactoring.
```bash
git clone https://github.com/gopi-erabati/RetSeg3D.git
cd RetSeg3D
```
Training

SemanticKITTI

- Single GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/train.py configs/retseg3d_semantickitti.py --work-dir {WORK_DIR}
```

- Multi GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
tools/dist_train.sh configs/retseg3d_semantickitti.py {GPU_NUM} --work-dir {WORK_DIR}
```
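If you prefer launching training from Python rather than the shell, the configs can also be consumed directly with the MMEngine Runner. A minimal sketch, assuming the repo root is on PYTHONPATH so its custom modules register; this mirrors what `tools/train.py` does, minus argument handling, the work directory name is illustrative, and the same pattern applies to the nuScenes and Waymo configs:

```python
# Hedged sketch: programmatic training with the MMEngine Runner.
from mmengine.config import Config
from mmengine.runner import Runner

cfg = Config.fromfile("configs/retseg3d_semantickitti.py")
cfg.work_dir = "work_dirs/retseg3d_semantickitti"  # illustrative work dir

runner = Runner.from_cfg(cfg)  # builds model, datasets and loops from the config
runner.train()
```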
nuScenes

- Single GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/train.py configs/retseg3d_nus.py --work-dir {WORK_DIR}
```

- Multi GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
tools/dist_train.sh configs/retseg3d_nus.py {GPU_NUM} --work-dir {WORK_DIR}
```
Waymo

- Single GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/train.py configs/retseg3d_waymo.py --work-dir {WORK_DIR}
```

- Multi GPU training: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
tools/dist_train.sh configs/retseg3d_waymo.py {GPU_NUM} --work-dir {WORK_DIR}
```
Testing

SemanticKITTI

- Single GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/test.py configs/retseg3d_semantickitti.py /path/to/ckpt --work-dir {WORK_DIR}
```

- Multi GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
./tools/dist_test.sh configs/retseg3d_semantickitti.py /path/to/ckpt {GPU_NUM} --work-dir {WORK_DIR}
```
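Evaluation can likewise be driven from Python. A minimal sketch, with the checkpoint path and work directory as placeholders; swap the config for nuScenes or Waymo as needed:

```python
# Hedged sketch: programmatic evaluation with the MMEngine Runner.
from mmengine.config import Config
from mmengine.runner import Runner

cfg = Config.fromfile("configs/retseg3d_semantickitti.py")
cfg.work_dir = "work_dirs/retseg3d_semantickitti_test"  # illustrative work dir
cfg.load_from = "/path/to/ckpt"  # checkpoint from the model table above

runner = Runner.from_cfg(cfg)
runner.test()  # runs the test loop and reports the segmentation metrics
```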
nuScenes

- Single GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/test.py configs/retseg3d_nus.py /path/to/ckpt --work-dir {WORK_DIR}
```

- Multi GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
./tools/dist_test.sh configs/retseg3d_nus.py /path/to/ckpt {GPU_NUM} --work-dir {WORK_DIR}
```
Waymo

- Single GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
python tools/test.py configs/retseg3d_waymo.py /path/to/ckpt --work-dir {WORK_DIR}
```

- Multi GPU testing: add the present working directory to PYTHONPATH, then run

```bash
export PYTHONPATH=$(pwd):$PYTHONPATH
./tools/dist_test.sh configs/retseg3d_waymo.py /path/to/ckpt {GPU_NUM} --work-dir {WORK_DIR}
```
We sincerely thank the contributors for their open-source code: MMDetection3D and Pointcept.
```bibtex
@article{ERABATI2024104231,
  title   = {RetSeg3D: Retention-based 3D semantic segmentation for autonomous driving},
  journal = {Computer Vision and Image Understanding},
  pages   = {104231},
  year    = {2024},
  issn    = {1077-3142},
  doi     = {10.1016/j.cviu.2024.104231},
  url     = {https://www.sciencedirect.com/science/article/pii/S1077314224003126},
  author  = {Gopi Krishna Erabati and Helder Araujo},
}
```