DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Wenbo Hu1*†, Xiangjun Gao2*, Xiaoyu Li1*†, Sijie Zhao1, Xiaodong Cun1,
Yong Zhang1, Long Quan2, Ying Shan3,1


1Tencent AI Lab 2The Hong Kong University of Science and Technology 3ARC Lab, Tencent PCG

arXiv preprint, 2024

🔆 Introduction

🤗 If you find DepthCrafter useful, please ⭐ this repo; stars matter to open-source projects. Thanks!

🔥 DepthCrafter can generate temporally consistent long depth sequences with fine-grained details for open-world videos, without requiring additional information such as camera poses or optical flow.

  • [24-11-26] 🚀🚀🚀 DepthCrafter v1.0.1 is released, with improved quality and speed.
  • [24-10-19] 🤗🤗🤗 DepthCrafter has been integrated into ComfyUI!
  • [24-10-08] 🤗🤗🤗 DepthCrafter has been integrated into Nuke. Give it a try!
  • [24-09-28] Added full-dataset inference and evaluation scripts for easier comparison. :-)
  • [24-09-25] 🤗🤗🤗 Added a Hugging Face online demo of DepthCrafter.
  • [24-09-19] Added scripts for preparing benchmark datasets.
  • [24-09-18] Added point cloud sequence visualization.
  • [24-09-14] 🔥🔥🔥 DepthCrafter is released. Have fun!

📦 Release Notes

  • DepthCrafter v1.0.1:
    • Quality and speed improvement
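      | Method                | ms/frame↓ @1024×576 | Sintel (~50 frames) AbsRel↓ | Sintel (~50 frames) δ₁↑ | Scannet (90 frames) AbsRel↓ | Scannet (90 frames) δ₁↑ | KITTI (110 frames) AbsRel↓ | KITTI (110 frames) δ₁↑ | Bonn (110 frames) AbsRel↓ | Bonn (110 frames) δ₁↑ |
      |-----------------------|---------------------|-----------------------------|-------------------------|-----------------------------|-------------------------|----------------------------|------------------------|---------------------------|-----------------------|
      | Marigold              | 1070.29             | 0.532                       | 0.515                   | 0.166                       | 0.769                   | 0.149                      | 0.796                  | 0.091                     | 0.931                 |
      | Depth-Anything-V2     | 180.46              | 0.367                       | 0.554                   | 0.135                       | 0.822                   | 0.140                      | 0.804                  | 0.106                     | 0.921                 |
      | DepthCrafter previous | 1913.92             | 0.292                       | 0.697                   | 0.125                       | 0.848                   | 0.110                      | 0.881                  | 0.075                     | 0.971                 |
      | DepthCrafter v1.0.1   | 465.84              | 0.270                       | 0.697                   | 0.123                       | 0.856                   | 0.104                      | 0.896                  | 0.071                     | 0.972                 |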
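The AbsRel and δ₁ columns follow the usual depth-estimation definitions: AbsRel is the mean of |pred − gt| / gt over valid pixels, and δ₁ is the fraction of valid pixels where max(pred/gt, gt/pred) < 1.25. The snippet below is a minimal sketch of those two formulas, not the repository's evaluation code. The function names are hypothetical, and any alignment of the prediction to the ground truth that a benchmark applies before scoring is assumed to have happened already.

```python
import numpy as np


def _valid_mask(gt, mask=None):
    """Pixels with positive ground-truth depth, optionally intersected with a user mask."""
    valid = gt > 0
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    return valid


def abs_rel(pred, gt, mask=None):
    """Mean absolute relative error: mean(|pred - gt| / gt) over valid pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = _valid_mask(gt, mask)
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))


def delta1(pred, gt, mask=None, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = _valid_mask(gt, mask)
    # Clamp predictions so non-positive values count as failures instead of dividing by zero.
    p = np.maximum(pred[valid], 1e-8)
    g = gt[valid]
    ratio = np.maximum(p / g, g / p)
    return float(np.mean(ratio < threshold))
```

Lower AbsRel and higher δ₁ are better, which is what the arrows in the table header indicate.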

🎥 Visualization

We provide demos of unprojected point cloud sequences, with reference RGB and estimated depth videos. Please refer to our project page for more details.
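For readers unfamiliar with unprojection, the sketch below shows how a single depth map can be lifted to a camera-space point cloud with a pinhole model. It is an illustration under stated assumptions, not the project's visualization code: the function name and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical inputs you would have to supply, and it assumes the depth values are already in the units you want for the Z axis.

```python
import numpy as np


def unproject_depth(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map to an (H*W, 3) point cloud in camera space."""
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction); both have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)
```

Applying this to every frame of a depth video, and coloring each point with the matching RGB pixel, yields a point cloud sequence.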


🚀 Quick Start

🤖 Gradio Demo

🌟 Community Support

  • NukeDepthCrafter: a plugin that lets you generate temporally consistent depth sequences inside Nuke, which is widely used in the VFX industry.
  • ComfyUI-Nodes: create consistent depth maps for your videos using DepthCrafter in ComfyUI.

🛠️ Installation

  1. Clone this repo:
     git clone https://github.com/Tencent/DepthCrafter.git
  2. Install dependencies (please refer to requirements.txt):
     pip install -r requirements.txt

🤗 Model Zoo

DepthCrafter is available on the Hugging Face Model Hub.

🏃‍♂️ Inference

1. High-resolution inference requires a GPU with ~26GB memory for 1024x576 resolution:

  • ~2.1 fps on A100, recommended for high-quality results:

    python run.py --video-path examples/example_01.mp4

2. Low-resolution inference requires a GPU with ~9GB memory for 512x256 resolution:

  • ~8.6 fps on A100:

    python run.py --video-path examples/example_01.mp4 --max-res 512
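To process several videos in one go, the two commands above can be wrapped in a small driver. The following is a rough sketch, assuming it is launched from the repository root so that run.py resolves; the helper names `build_command` and `run_folder` are hypothetical, and only the `--video-path` and `--max-res` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video_path, max_res=None):
    """Assemble the run.py invocation; omit --max-res to keep the default resolution."""
    cmd = [sys.executable, "run.py", "--video-path", str(video_path)]
    if max_res is not None:
        cmd += ["--max-res", str(max_res)]
    return cmd


def run_folder(folder, max_res=None, dry_run=True):
    """Run (or, with dry_run=True, only print) one command per .mp4 file in folder."""
    commands = [build_command(p, max_res) for p in sorted(Path(folder).glob("*.mp4"))]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing video.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `run_folder("examples", max_res=512)` prints the low-resolution command for each example video; pass `dry_run=False` to actually execute them.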

🚀 Dataset Evaluation

Please check the benchmark folder.

  • To create the datasets used in the paper, run dataset_extract/dataset_extract_${dataset_name}.py.
  • This produces CSV files that record the relative paths of the extracted RGB videos and depth npz files. We also provide these CSV files.
  • Run inference on all datasets:
    bash benchmark/infer/infer.sh
    (Remember to replace input_rgb_root and saved_root with your own paths.)
  • Run evaluation on all datasets:
    bash benchmark/eval/eval.sh
    (Remember to replace pred_disp_root and gt_disp_root with your own paths.)

🤝🍻 Contributing

  • Issues and pull requests are welcome.

  • Contributions that improve inference speed and memory usage are welcome, e.g., through model quantization, distillation, or other acceleration techniques.

📜 Citation

If you find this work helpful, please consider citing:

@article{hu2024-DepthCrafter,
    author  = {Hu, Wenbo and Gao, Xiangjun and Li, Xiaoyu and Zhao, Sijie and Cun, Xiaodong and Zhang, Yong and Quan, Long and Shan, Ying},
    title   = {DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos},
    journal = {arXiv preprint arXiv:2409.02095},
    year    = {2024}
}