MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling

This is the official PyTorch implementation of MoGenTS. For technical details, please refer to:

MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Weihao Yuan, Weichao Shen, Yisheng HE, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
NeurIPS 2024
[Project Page] | [Paper]

  

Bibtex

If you find this code useful in your research, please cite:

@inproceedings{yuan2024mogents,
    title={MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling},
    author={Weihao Yuan and Weichao Shen and Yisheng HE and Yuan Dong and Xiaodong Gu and Zilong Dong and Liefeng Bo and Qixing Huang},
    booktitle = {Neural Information Processing Systems (NeurIPS)},
    pages={},
    year={2024},
}

Contents

  1. Environment
  2. Dependencies
  3. Demo
  4. Training
  5. Evaluation

Environment

  • Install the Conda environment
conda env create -f environment.yml
conda activate momask
pip install git+https://github.com/openai/CLIP.git
  • Or install with pip (a quick sanity check for either path follows below)
conda create -n mogents python=3.8
conda activate mogents
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
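
A quick check that PyTorch is installed with CUDA support:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"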

Dependencies

1. Download the pretrained models

Download the models and place them at ./logs/humanml3d/ (a placement sketch follows the table):

Model       FID
HumanML3D   0.028
KIT-ML      0.135
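
For example, assuming the checkpoints have been downloaded to a local folder (the source path below is only a placeholder):

mkdir -p ./logs/humanml3d
mv /path/to/downloaded/checkpoints/* ./logs/humanml3d/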

2. Evaluation Models and GloVe

  • Follow previous methods to prepare the evaluation models and GloVe word embeddings, or directly download them from here and place them in ./checkpoints.

3. Dataset (Only for training)

  • HumanML3D - Follow the instructions in HumanML3D, then place the resulting dataset in ./dataset/HumanML3D (a quick check of the expected layout is sketched after this list).

  • KIT-ML - Download it from HumanML3D, then place the dataset in ./dataset/KIT-ML.
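
As a sanity check, list the prepared dataset folder (the expected contents in the comment follow the HumanML3D repo and are an assumption for this codebase):

ls ./dataset/HumanML3D
# typically contains: new_joint_vecs/  new_joints/  texts/  Mean.npy  Std.npy  train.txt  val.txt  test.txt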

Demo

  

python demo_mogen.py --gpu_id 0 --ext exp1 --text_prompt "A person is walking on a circle." --checkpoints_dir logs --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans

Explanation of some parameters (an example command follows this list):

  • --repeat_times: number of times the generation is repeated; default 1.
  • --motion_length: the number of poses (frames) to generate.
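
For example, to repeat the generation several times and specify the motion length (the flag values and output name here are illustrative):

python demo_mogen.py --gpu_id 0 --ext exp2 --text_prompt "A person is walking on a circle." --checkpoints_dir logs --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans --repeat_times 3 --motion_length 196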

Outputs in ./outputs/exp1/ (a quick inspection command follows this list):

  • numpy files: generated motions with shape (nframe, 22, 3), in the subfolder ./joints.
  • video files: stick-figure animations in MP4 format, in the subfolder ./animation.
  • bvh files: BVH files of the generated motions, in the subfolder ./animation.
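
To inspect a generated motion, load one of the numpy files; the one-liner below simply picks the first .npy it finds (no particular file name is assumed):

python -c "import glob, numpy as np; f = sorted(glob.glob('outputs/exp1/joints/*.npy'))[0]; print(f, np.load(f).shape)"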

Then you can follow MoMask to retarget the generated motion to other 3D characters for visualization.

Training

  1. Train the VQVAE
bash run_rvq.sh vq 0 humanml3d --batch_size 256 --num_quantizers 6 --max_epoch 50 --quantize_dropout_prob 0.2 --gamma 0.1 --code_dim2d 1024 --nb_code2d 256
  2. Train the Mask Transformer
bash run_mtrans.sh mtrans 4 humanml3d --vq_name vq --batch_size 384 --max_epoch 2000 --attnj --attnt
  3. Train the Residual Transformer
bash run_rtrans.sh rtrans 2 humanml3d --batch_size 64 --vq_name vq --cond_drop_prob 0.01 --share_weight --max_epoch 2000 --attnj --attnt

Evaluation

  1. Evaluate the VQVAE
python eval_vq.py --gpu_id 0 --name pretrain_vq --dataset_name humanml3d --ext eval --which_epoch net_best_fid.tar
  2. Evaluate the Mask Transformer
python eval_mask.py --dataset_name humanml3d --mtrans_name pretrain_mtrans --gpu_id 0 --cond_scale 4 --time_steps 10 --ext eval --which_epoch fid
  3. Evaluate Mask + Residual Transformer

HumanML3D:

python eval_res.py --gpu_id 0 --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans --cond_scale 4 --time_steps 10 --ext eval --which_ckpt net_best_fid.tar --which_epoch fid --traverse_res

KIT-ML:

python eval_res.py --gpu_id 0 --dataset_name kit --mtrans_name pretrain_mtrans_kit --rtrans_name pretrain_rtrans_kit --cond_scale 4 --time_steps 10 --ext eval --which_ckpt net_best_fid.tar --which_epoch fid --traverse_res

Acknowledgements

We sincerely thank the authors of the excellent open-source works that our code is based on:

MoMask

License

This code is distributed under the MIT License.

Note that our code depends on other libraries, including SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses, which must also be followed.