We tested our method with PyTorch 1.10 and CUDA 11.3 (cu113).
# clone repo
git clone https://github.com/ChaoyueSong/MoDA.git --recursive
cd MoDA
# create conda env
conda env create -f misc/moda.yml
conda activate moda
# install pytorch3d, kmeans-pytorch
pip install -e third_party/pytorch3d
pip install -e third_party/kmeans_pytorch
# install detectron2
python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
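After installation, a quick import check can confirm the environment. This is a minimal sanity sketch that only prints versions (expect 1.10.x and 11.3):
# optional sanity check: verify the torch/CUDA build and the editable installs
python -c "import torch; print(torch.__version__, torch.version.cuda)"
python -c "import pytorch3d, detectron2; print('pytorch3d and detectron2 OK')"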
For the casual-human (adult7) and casual-cat (cat-pikachiu) sequences used in this work, you can download the pre-processed data as in BANMo; please check the license for these data in BANMo.
# ~8 GB for each sequence
bash misc/processed/download.sh cat-pikachiu
bash misc/processed/download.sh human-cap
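The script should place the data under database/DAVIS/; the exact sub-folders follow BANMo's pre-processed format, so the path below is an assumption based on that layout:
# optional: spot-check the downloaded frames (path assumes BANMo's layout)
ls database/DAVIS/JPEGImages/Full-Resolution/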
For AMA and synthetic data, please check here.
To use your own videos, or to pre-process raw videos into our format, please follow these instructions.
Download the pre-trained PoseNet weights for humans and quadrupeds:
mkdir -p mesh_material/posenet && cd "$_"
wget $(cat ../../misc/posenet.txt); cd ../../
# Following BANMo, we store images as lines of pixels.
# This only needs to be run once per sequence; the output is stored in
# database/DAVIS/Pixel
python preprocess/img2lines.py --seqname cat-pikachiu
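Once it finishes, the per-sequence output can be sanity-checked (the path is the one named in the comment above):
# optional: confirm the pixel-line data was generated for the sequence
ls database/DAVIS/Pixel/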
# Training
bash scripts/template.sh 0,1 cat-pikachiu 10001 "no" "no"
# argv[1]: GPU ids, separated by commas
# argv[2]: sequence name
# argv[3]: port for distributed training
# argv[4]: use_human; pass "" for human data and "no" otherwise
# argv[5]: use_symm; pass "" to force an x-symmetric shape and "no" otherwise
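Checkpoints are written under logdir/ (see the weights path used in the rendering step below). If tensorboard event files are also written there, as in BANMo's training code (an assumption, not verified for this repo), training can be monitored with:
# optional: monitor training curves; assumes tensorboard logs under logdir/
tensorboard --logdir logdir/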
# Extract articulated meshes and render
bash scripts/render_mgpu.sh 0 cat-pikachiu logdir/cat-pikachiu-e120-b256-ft2/params_latest.pth \
"0 1 2 3 4 5 6 7 8 9 10" 256
# argv[1]: GPU id
# argv[2]: sequence name
# argv[3]: path to the trained weights
# argv[4]: video ids, separated by spaces
# argv[5]: resolution for marching cubes (256 by default)
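For a quick preview, the same script can be pointed at a single video and a coarser grid; this is just a usage sketch with the arguments documented above:
# optional: faster preview, rendering only video 0 on a 128^3 marching-cubes grid
bash scripts/render_mgpu.sh 0 cat-pikachiu logdir/cat-pikachiu-e120-b256-ft2/params_latest.pth \
"0" 128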
# Casual-human (adult7): run the same pipeline
python preprocess/img2lines.py --seqname adult7
bash scripts/template.sh 0,1 adult7 10001 "" ""
bash scripts/render_mgpu.sh 0 adult7 logdir/adult7-e120-b256-ft2/params_latest.pth \
"0 1 2 3 4 5 6 7 8 9" 256
TODO:
- Initial code release.
- Code cleaning and further checking.
- Release the pre-trained models.
If you find this work useful, please consider citing:

@article{song2024moda,
  title={MoDA: Modeling Deformable 3D Objects from Casual Videos},
  author={Song, Chaoyue and Wei, Jiacheng and Chen, Tianyi and Chen, Yiwen and Foo, Chuan-Sheng and Liu, Fayao and Lin, Guosheng},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2024},
  publisher={Springer}
}
We thank the authors of BANMo for their code and data.