[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

3DTopia/LGM

Large Multi-View Gaussian Model

This is the official implementation of LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

demo.mp4

News

[2024.4.3] Thanks to @yxymessi and @florinshen, we have fixed a severe bug in rotation normalization here. We have finetuned the model with correct normalization for 30 more epochs and uploaded new checkpoints.

Replicate Demo:

Thanks to @camenduru!

Install

# xformers is required! please refer to https://github.com/facebookresearch/xformers for details.
# for example, we use torch 2.1.0 + cuda 11.8
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization

# for mesh extraction
pip install git+https://github.com/NVlabs/nvdiffrast

# other dependencies
pip install -r requirements.txt
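
After installing, it can help to confirm that the key packages are visible to Python before running anything heavy. The following is a minimal sketch, not part of this repository; the import names in `REQUIRED_MODULES` are assumed to match the packages installed above.

```python
import importlib.util

# Import names assumed for the packages installed above.
REQUIRED_MODULES = ["torch", "xformers", "diff_gaussian_rasterization", "nvdiffrast"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that the modules can be located; it does not verify that the CUDA extensions were compiled against the right torch version.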

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

For example, to download the fp16 model for inference:

mkdir pretrained && cd pretrained
wget https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16_fixrot.safetensors
cd ..

For MVDream and ImageDream, we use a diffusers implementation. Their weights will be downloaded automatically.
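
If `wget` is not available, the same LGM checkpoint can be fetched from Python. This is a minimal sketch using only the standard library; the helper names `weight_url` and `download_weights` are hypothetical, and the URL pattern is simply the one from the `wget` command above.

```python
import os
import urllib.request

REPO_ID = "ashawkey/LGM"
FILENAME = "model_fp16_fixrot.safetensors"


def weight_url(repo_id=REPO_ID, filename=FILENAME, revision="main"):
    """Build the direct download URL for a file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_weights(target_dir="pretrained", filename=FILENAME):
    """Download the checkpoint into target_dir unless it already exists."""
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, filename)
    if not os.path.exists(target):
        urllib.request.urlretrieve(weight_url(filename=filename), target)
    return target


# Example (downloads a large file, so it is left commented out):
# path = download_weights()
```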

Inference

Inference takes about 10 GB of GPU memory (with ImageDream, MVDream, and our LGM all loaded).
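
To check in advance whether a machine has a GPU large enough, one option is to query `nvidia-smi`. The sketch below is an illustration under assumptions: the helper names are hypothetical, the 10 GB figure is treated as roughly 10240 MiB, and it reports total rather than currently free memory.

```python
import subprocess

REQUIRED_MIB = 10 * 1024  # "about 10 GB" from the note above, approximated in MiB


def parse_memory_mib(output):
    """Parse nvidia-smi CSV output: one integer MiB value per GPU line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_total_memory_mib():
    """Ask nvidia-smi for the total memory of each visible GPU, in MiB."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_memory_mib(output)


def gpus_with_enough_memory(totals_mib, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose total memory meets the requirement."""
    return [i for i, total in enumerate(totals_mib) if total >= required_mib]


# Example (requires an NVIDIA driver, so it is left commented out):
# print(gpus_with_enough_memory(query_total_memory_mib()))
```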

### gradio app for both text/image to 3D
python app.py big --resume pretrained/model_fp16_fixrot.safetensors

### test
# --workspace: folder to save output (*.ply and *.mp4)
# --test_path: path to a folder containing images, or a single image
python infer.py big --resume pretrained/model_fp16_fixrot.safetensors --workspace workspace_test --test_path data_test

### local gui to visualize saved ply
python gui.py big --output_size 800 --test_path workspace_test/saved.ply

### mesh conversion
python convert.py big --test_path workspace_test/saved.ply
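
Before opening a saved result in the GUI or converting it to a mesh, it can be useful to check what the file contains. The sketch below reads only the PLY header with the standard library; it assumes the saved file is a standard PLY with one `vertex` element per Gaussian, which is the common layout for Gaussian splatting output but is not confirmed here. The function name `read_ply_header` is hypothetical.

```python
def read_ply_header(path):
    """Return (vertex_count, property_names) read from a PLY file header."""
    vertex_count = None
    properties = []
    in_vertex = False
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with a PLY magic line")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # stop before the (possibly binary) payload
            if tokens[0] == "element":
                in_vertex = tokens[1] == "vertex"
                if in_vertex:
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and in_vertex:
                properties.append(tokens[-1])  # last token is the property name
    return vertex_count, properties


# Example, using the output path from the commands above:
# count, props = read_ply_header("workspace_test/saved.ply")
# print(count, props)
```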

For more options, please check options.
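
One option worth illustrating is `--test_path`, which accepts either a folder of images or a single image. The sketch below shows one way such an argument could be expanded into a list of files; it is not the code used by `infer.py`, and both the extension filter and the name `collect_inputs` are assumptions made for illustration.

```python
import os

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def collect_inputs(test_path):
    """Expand a folder into its image files, or wrap a single file in a list."""
    if os.path.isdir(test_path):
        names = sorted(os.listdir(test_path))
        return [
            os.path.join(test_path, name)
            for name in names
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        ]
    return [test_path]
```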

Training

NOTE: Since the dataset used in our training is stored on AWS, it cannot be used directly for training in a new environment. We provide the necessary training code framework; please check and modify the dataset implementation!

We also provide the ~80K subset of Objaverse used to train LGM in objaverse_filter.
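
When adapting the dataset implementation, a typical first step is to restrict your local Objaverse copy to the objects in this subset. The sketch below assumes the subset has been exported as a plain text file with one object UID per line; that layout and the function names are hypothetical, so check the actual file format in objaverse_filter first.

```python
def load_uids(path):
    """Read one UID per line, ignoring blank lines (assumed file layout)."""
    with open(path, "r", encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def select_training_uids(local_uids, subset_uids):
    """Keep only the locally available UIDs that are in the training subset."""
    return sorted(uid for uid in local_uids if uid in subset_uids)


# Example with placeholder UIDs:
# subset = load_uids("subset_uids.txt")  # hypothetical export of the subset
# print(select_training_uids(["uid_a", "uid_b"], subset))
```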

# debug training
accelerate launch --config_file acc_configs/gpu1.yaml main.py big --workspace workspace_debug

# training (use slurm for multi-node training)
accelerate launch --config_file acc_configs/gpu8.yaml main.py big --workspace workspace
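
For multi-node runs under Slurm, each node needs to launch the same command with its own rank. The sketch below builds such a command from the environment. It is an illustration, not the launch script used for the paper: `SLURM_NNODES` and `SLURM_NODEID` are standard Slurm variables, but `MASTER_ADDR` is assumed to be exported by your own job script, and the port is an arbitrary choice.

```python
def build_launch_command(env, config_file="acc_configs/gpu8.yaml",
                         workspace="workspace", port=29500):
    """Build an `accelerate launch` argument list for the current node."""
    num_machines = int(env.get("SLURM_NNODES", "1"))
    machine_rank = int(env.get("SLURM_NODEID", "0"))
    command = ["accelerate", "launch", "--config_file", config_file]
    if num_machines > 1:
        # MASTER_ADDR is an assumption: it must be set by the job script.
        command += [
            "--num_machines", str(num_machines),
            "--machine_rank", str(machine_rank),
            "--main_process_ip", env["MASTER_ADDR"],
            "--main_process_port", str(port),
        ]
    command += ["main.py", "big", "--workspace", workspace]
    return command


# Example with a hand-written environment for the second of two nodes:
example_env = {"SLURM_NNODES": "2", "SLURM_NODEID": "1", "MASTER_ADDR": "node0"}
print(" ".join(build_launch_command(example_env)))
```

On a real cluster you would pass `os.environ` instead of `example_env`. The total process count in the accelerate config may also need to be adjusted to cover all nodes.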

Acknowledgement

This work is built on many amazing research works and open-source projects. Thanks a lot to all the authors for sharing!

Citation

@article{tang2024lgm,
  title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
  author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
  journal={arXiv preprint arXiv:2402.05054},
  year={2024}
}