
COS3D: Collaborative Open-Vocabulary 3D Segmentation

This is the official PyTorch implementation of the following publication:

COS3D: Collaborative Open-Vocabulary 3D Segmentation
Runsong Zhu, Ka-Hei Hui, Zhengzhe Liu, Qianyi Wu, Weiliang Tang, Shi Qiu, Pheng-Ann Heng, Chi-Wing Fu.
NeurIPS 2025
Paper (NeurIPS) | Paper (arXiv)

Introduction

TL;DR:

1. This paper contributes COS3D, a novel and effective collaborative prompt-segmentation framework for the open-vocabulary 3D segmentation task.

2. Extensive experiments demonstrate that COS3D (i) significantly outperforms existing baselines with superior training efficiency, and (ii) shows high potential for various applications, such as novel image-based 3D segmentation, hierarchical segmentation, and robotics.

Teaser

Teaser image

Overview

Method image

Applications

Method image

Requirements

The code has been tested on:

  • Ubuntu 20.04
  • CUDA 11.8
  • Python 3.8.18
  • PyTorch 1.12.1
  • GeForce RTX 4090

Installation

Cloning the Repository

The repository contains submodules, so please clone it with:

# HTTPS
git clone https://github.com/Runsong123/COS3D.git --recursive
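If you have already cloned the repository without --recursive, you can fetch the submodules afterwards with:

# fetch submodules for an existing clone
git submodule update --init --recursive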

Environment Setup

Our default installation method is based on Conda package and environment management:

conda env create --file environment.yml
conda activate COS3D

Then, download the SAM checkpoint from here and place it in the ckpts/ directory.
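For example, assuming the ViT-H SAM variant is the one expected (please check the repository scripts for the exact filename), the checkpoint can be fetched with:

# assumption: ViT-H variant; verify the expected filename in the repo scripts
mkdir -p ckpts
wget -P ckpts https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth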

Pre-processing

  1. Download/prepare the dataset (images, plus segmentation and language features extracted with a 2D foundation model).
  2. Obtain the 3DGS (3D Gaussian Splatting) reconstruction from the given images (see the sketch below).
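As a minimal sketch of step 2, assuming the bundled 3DGS submodule follows the original gaussian-splatting interface (the actual script location and flags may differ), the reconstruction can be obtained with:

# assumes a COLMAP-processed scene folder (images/ + sparse/); paths are illustrative
python train.py -s /path/to/scene -m /path/to/output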

Training

Our training process consists of two steps:

  1. Stage 1: Training the instance field.
  2. Stage 2: Learning the instance-to-language mapping.

For simplicity, you can run the following training script:

cd ./script/train && bash train.sh
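Conceptually, the script runs the two stages back to back. The script names and flags below are hypothetical placeholders, so refer to script/train/train.sh for the actual entry points:

# hypothetical placeholders -- see script/train/train.sh for the real commands
python train_instance_field.py -s /path/to/scene      # Stage 1: instance field
python train_instance2language.py -s /path/to/scene   # Stage 2: instance-to-language mapping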

Inference

Our inference process consists of three main steps:

  1. 3D grounding for the given queries.
  2. Rendering images for novel views.
  3. Exporting the metrics.

For simplicity, you can run the following inference script:

cd ./script/infer && bash infer.sh
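As a rough illustration only, the three steps might map to calls like the following; every script name and flag here is a hypothetical placeholder, so consult script/infer/infer.sh for the real commands:

# hypothetical placeholders -- see script/infer/infer.sh for the real commands
python ground_queries.py -s /path/to/scene --queries "green apple" "tea cup"   # 1. 3D grounding
python render_views.py -s /path/to/scene                                      # 2. novel-view rendering
python export_metrics.py -s /path/to/scene                                    # 3. metrics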

Data and checkpoint

You can download the LERF dataset from this OneDrive / Baidu (provided by OpenGaussian). Additionally, we provide our COS3D checkpoint for quick testing.

TODO list

  • Release training code
  • Release evaluation code
  • Release the preprocessing code to support various 2D foundation models (e.g., SAM2 and Semantic-SAM for segmentation results, and SigLIP for language features).
  • Release the applications code (e.g., novel image-based query).

This repository is still under construction. Please feel free to open issues or submit pull requests. We appreciate all contributions to this project.

Citation

@article{zhu2025cos3d,
  title={COS3D: Collaborative Open-Vocabulary 3D Segmentation},
  author={Zhu, Runsong and Hui, Ka-Hei and Liu, Zhengzhe and Wu, Qianyi and Tang, Weiliang and Qiu, Shi and Heng, Pheng-Ann and Fu, Chi-Wing},
  journal={arXiv preprint arXiv:2510.20238},
  year={2025}
}

Related Projects

Some code snippets are borrowed from OpenGaussian, LangSplat, GAGS, and Unified-Lift. We thank the authors for releasing their code.
