Inference of MiniCPM-o 2.6 in plain C/C++
- Plain C/C++ implementation based on ggml.
- Requires only 8GB of VRAM for inference.
- Supports streaming processing for both audio and video inputs.
- Optimized for real-time video streaming on NVIDIA Jetson Orin Nano Super.
- Provides Python bindings, a web demo, and additional integration possibilities.
Clone the repository and initialize its submodules:
# Clone the repository
git clone https://github.com/360CVGroup/MiniCPM-o.cpp.git
cd MiniCPM-o.cpp
# Initialize and update submodules
git submodule update --init --recursive
Set up the Python environment and install the package:
# We recommend using uv for Python environment and package management
pip install uv
# Create and activate a virtual environment
uv venv
source .venv/bin/activate
# For fish shell, use: source .venv/bin/activate.fish
# Install the package in editable mode
uv pip install -e . --verbose
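As a quick sanity check, you can verify that the editable install succeeded. The module name minicpmo is an assumption here; check the project's pyproject.toml for the actual importable name.
# Verify the install (module name "minicpmo" is an assumption; adjust if needed)
python -c "import minicpmo; print('minicpmo imported successfully')"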
For detailed installation steps, please refer to the installation guide.
Use the pre-converted and quantized GGUF models (recommended), available from Google Drive or ModelScope. Download and place all models in the models/ directory.
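After downloading, the models/ directory should contain the three GGUF files referenced by the commands below. The filenames are taken from the example invocation; the quantization suffixes may differ depending on which variants you download.
# Expected layout (quantization variants may differ)
models/
├── minicpmo-audio-encoder_Q4_K.gguf   # audio encoder (--apm-path)
├── minicpmo-image-encoder_Q4_1.gguf   # image encoder (--vpm-path)
└── Model-7.6B-Q4_K_M.gguf             # LLM backbone (--llm-path)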
For ease of integration, we provide Python bindings. Run the test script:
# in project root path
python test/test_minicpmo.py \
    --apm-path models/minicpmo-audio-encoder_Q4_K.gguf \
    --vpm-path models/minicpmo-image-encoder_Q4_1.gguf \
    --llm-path models/Model-7.6B-Q4_K_M.gguf \
    --video-path assets/Skiing.mp4
We also provide a C/C++ interface. For details, please refer to the C++ Interface Documentation.
Real-time video interaction demo:
# in project root path
uv pip install -r web_demos/minicpm-o_2.6/requirements.txt
python web_demos/minicpm-o_2.6/model_server.py
# Make sure Node.js and pnpm are installed.
sudo apt-get update
sudo apt-get install nodejs npm
npm install -g pnpm
cd web_demos/minicpm-o_2.6/web_server
# Create an SSL certificate; HTTPS is required to request camera and microphone permissions (see the manual openssl sketch below).
bash ./make_ssl_cert.sh # output key.pem and cert.pem
pnpm install # install requirements
pnpm run dev # start server
Open https://localhost:8088/ in your browser for real-time video calls.
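If you prefer not to use the helper script, a self-signed certificate can also be generated directly with openssl. This is a rough sketch of what make_ssl_cert.sh is assumed to do; it writes the key.pem and cert.pem files the web server expects.
# Generate a self-signed certificate for local HTTPS (assumed equivalent of make_ssl_cert.sh)
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout key.pem -out cert.pem -subj "/CN=localhost"
Your browser will warn about the self-signed certificate; accept the warning to proceed to the demo.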
We have deployed the MiniCPM-o 2.6 model on the NVIDIA Jetson Orin Nano Super (8GB) embedded device. This project supports real-time inference on the Jetson Orin Nano Super 8GB in MAXN SUPER mode.
If your device is not yet running the Super system package, refer to the installation manual for instructions on installing it on your board.
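On JetPack-based systems, the power mode is switched with the nvpmodel utility. The MAXN SUPER mode index below is an assumption and varies across JetPack releases, so check the modes defined in /etc/nvpmodel.conf first.
# Show the current power mode
sudo nvpmodel -q
# Switch to MAXN SUPER (mode index 2 is an assumption; verify against /etc/nvpmodel.conf)
sudo nvpmodel -m 2
# Optionally pin CPU/GPU clocks to their maximum
sudo jetson_clocks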
We recorded a video of the model running on the Jetson device in real time, with no speed-up applied.
For NVIDIA Jetson Orin Nano Super performance, including inference time and first-token latency data, see Inference Performance Optimization.
This project is licensed under the Apache 2.0 License. For model usage and distribution, please comply with the official model license.
- llama.cpp: LLM inference in C/C++
- whisper.cpp: Port of OpenAI's Whisper model in C/C++
- transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- MiniCPM-o: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone.