
MoDA: Modeling Deformable 3D Objects from Casual Videos

IJCV 2024


Installation

We tested our method with PyTorch 1.10 and CUDA 11.3.

# clone repo
git clone https://github.com/ChaoyueSong/MoDA.git --recursive
cd MoDA
# create conda env
conda env create -f misc/moda.yml
conda activate moda
# install pytorch3d, kmeans-pytorch
pip install -e third_party/pytorch3d
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
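After installing, a quick sanity check can confirm the dependencies are importable. This is a hypothetical helper (the package names are assumptions based on the steps above; no such script ships with MoDA):

```python
# Hypothetical post-install check: report which dependencies can be found.
# find_spec() only locates a package; it does not import it, so this is
# safe to run even in a broken environment.
import importlib.util

def check_packages(names):
    """Map each top-level package name to whether it is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

for pkg, ok in check_packages(["torch", "pytorch3d", "kmeans_pytorch", "detectron2"]).items():
    print(f"{pkg}: {'ok' if ok else 'MISSING'}")
```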

Data preparation

For casual-human (adult7) and casual-cat (cat-pikachiu) used in this work, you can download the pre-processed data as in BANMo; please check BANMo for the license terms of these data.

# (~8 GB each)
bash misc/processed/download.sh cat-pikachiu
bash misc/processed/download.sh human-cap

For AMA and Synthetic data, please check here.

To use your own videos, or pre-process raw videos into our format, please follow this instruction.

PoseNet weights

Download pre-trained PoseNet weights for human and quadrupeds.

mkdir -p mesh_material/posenet && cd "$_"
wget $(cat ../../misc/posenet.txt); cd ../../
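The one-liner above fetches every URL listed in misc/posenet.txt in a single wget call. An equivalent sketch with per-file error reporting (assuming one URL per line, which the `wget $(cat ...)` form implies; this is our variant, not part of the repo):

```shell
# Sketch only: same effect as the one-liner above, but reports each
# failed download instead of aborting the whole wget call.
mkdir -p mesh_material/posenet
if [ -f misc/posenet.txt ]; then
  while read -r url; do
    wget -q -P mesh_material/posenet "$url" || echo "failed: $url" >&2
  done < misc/posenet.txt
fi
```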

Training

1. cat-pikachiu (casual-cat)

# We store images as lines of pixels, following BANMo.
# This only needs to be run once per sequence; the data are stored in
# database/DAVIS/Pixel
python preprocess/img2lines.py --seqname cat-pikachiu

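The "lines of pixels" idea can be sketched as follows. This is illustrative only, not the actual preprocess/img2lines.py implementation: each H x W frame is flattened into H rows, so a dataloader can fetch contiguous rows of rays rather than scattered pixels.

```python
# Illustrative sketch (not the real img2lines.py): flatten each frame
# into its pixel rows, tagging every row with its source frame id.
def frames_to_lines(frames):
    """frames: list of H x W images -> flat list of (frame_id, row) lines."""
    lines = []
    for fid, frame in enumerate(frames):
        for row in frame:
            lines.append((fid, row))
    return lines

# Two dummy 2x3 frames stand in for real video frames.
frames = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
lines = frames_to_lines(frames)
```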
# Training
bash scripts/template.sh 0,1 cat-pikachiu 10001 "no" "no"
# argv[1]: gpu ids separated by comma
# argv[2]: sequence name
# argv[3]: port for distributed training
# argv[4]: use_human, pass "" for humans, "no" for other categories
# argv[5]: use_symm, pass "" to force an x-symmetric shape, "no" otherwise
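The empty-string vs. "no" convention for the last two arguments is easy to misread. A minimal sketch of how such positional flags behave (illustrative only; this is not the actual scripts/template.sh parsing):

```python
# Illustrative only: positional flags where the empty string "" enables
# a feature and "no" disables it, matching the argument list above.
def parse_flags(argv):
    """argv: [gpus, seqname, port, use_human, use_symm] as strings."""
    gpus, seqname, port, use_human, use_symm = argv
    return {
        "gpus": [int(g) for g in gpus.split(",")],
        "seqname": seqname,
        "port": int(port),
        "use_human": use_human == "",  # "" -> human, "no" -> other category
        "use_symm": use_symm == "",    # "" -> force x-symmetric shape
    }

flags = parse_flags(["0,1", "cat-pikachiu", "10001", "no", "no"])
```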

# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 cat-pikachiu logdir/cat-pikachiu-e120-b256-ft2/params_latest.pth \
        "0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: gpu id
# argv[2]: sequence name
# argv[3]: weights path
# argv[4]: video id separated by space
# argv[5]: resolution of running marching cubes (256 by default)
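Note that argv[5] sets the marching-cubes grid resolution, and the number of sampled values grows cubically with it. A quick back-of-envelope (the float32 storage assumption is ours, not taken from the MoDA code):

```python
# Back-of-envelope only: an R x R x R marching-cubes grid samples R**3
# scalar values; assuming float32 (4 bytes each) gives the raw footprint.
def grid_mib(resolution, bytes_per_value=4):
    return resolution ** 3 * bytes_per_value / 2**20

for r in (128, 256, 512):
    print(f"{r}^3 grid: {grid_mib(r):.0f} MiB")
```

Doubling the resolution costs 8x the memory, which is why 256 is a reasonable default.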

2. adult7 (casual-human)

python preprocess/img2lines.py --seqname adult7
bash scripts/template.sh 0,1 adult7 10001 "" ""
bash scripts/render_mgpu.sh 0 adult7 logdir/adult7-e120-b256-ft2/params_latest.pth \
        "0 1 2 3 4 5 6 7 8 9" 256

TODO

  • Initial code release.
  • Code cleaning and further checking.
  • Release the pretrained models.

Citation

@article{song2024moda,
  title={{MoDA}: Modeling Deformable {3D} Objects from Casual Videos},
  author={Song, Chaoyue and Wei, Jiacheng and Chen, Tianyi and Chen, Yiwen and Foo, Chuan-Sheng and Liu, Fayao and Lin, Guosheng},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2024},
  publisher={Springer}
}

Acknowledgments

We thank the authors of BANMo for their code and data.
