MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling

This is the official PyTorch implementation code for MoGenTS. For technical details, please refer to:

MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Weihao Yuan, Weichao Shen, Yisheng HE, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, Qixing Huang
NeurIPS 2024
[Project Page] | [Paper]


Bibtex

If you find this code useful in your research, please cite:

@inproceedings{yuan2024mogents,
    title={MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling},
    author={Weihao Yuan and Weichao Shen and Yisheng HE and Yuan Dong and Xiaodong Gu and Zilong Dong and Liefeng Bo and Qixing Huang},
    booktitle = {Neural Information Processing Systems (NeurIPS)},
    pages={},
    year={2024},
}

Contents

  1. Environment
  2. Dependencies
  3. Demo
  4. Training
  5. Evaluation

Environment

  • Install with the Conda environment file
conda env create -f environment.yml
conda activate momask
pip install git+https://github.com/openai/CLIP.git
  • Or install with pip
conda create -n mogents python=3.8
conda activate mogents
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
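
After either install option, an optional sanity check (not part of the original instructions) is to confirm that the expected PyTorch build and CUDA runtime are visible:

# optional: check the installed PyTorch version, CUDA build, and GPU availability
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"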

Dependencies

1. Download the pretrained models

Download the models and place them at ./logs/humanml3d/.

Model       FID
HumanML3D   0.028
KIT-ML      0.135

2. Evaluation Models and GloVe

  • Follow the previous method to prepare the evaluation models and GloVe word embeddings, or download them directly from here and place them in ./checkpoints.

3. Datasets (only for training)

  • HumanML3D - Follow the instructions in HumanML3D, then place the resulting dataset in ./dataset/HumanML3D.

  • KIT-ML - Download it via the HumanML3D repository as well, then place the dataset in ./dataset/KIT-ML (see the quick check below).
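
As an optional sanity check before training (paths taken from the two items above; not part of the original instructions), confirm that the dataset folders are in the expected locations:

# optional: list only the dataset folder(s) you actually prepared
ls ./dataset/HumanML3D
ls ./dataset/KIT-ML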

Demo


python demo_mogen.py --gpu_id 0 --ext exp1 --text_prompt "A person is walking on a circle." --checkpoints_dir logs --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans

Explanation of some parameters:

  • --repeat_times: number of times to repeat the generation, default 1.
  • --motion_length: number of poses to generate (see the example below).
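
For example, a hypothetical run combining these options with the command above (the --ext name, text prompt, and the values 3 and 96 are only illustrative):

python demo_mogen.py --gpu_id 0 --ext exp2 --text_prompt "A person jumps forward." --checkpoints_dir logs --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans --repeat_times 3 --motion_length 96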

Outputs written to ./outputs/exp1/:

  • NumPy files: generated motions of shape (nframe, 22, 3), in the ./joints subfolder (see the quick check below).
  • video files: stick-figure animations in MP4 format, in the ./animation subfolder.
  • BVH files: BVH files of the generated motions, also in the ./animation subfolder.
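
A minimal way to check these outputs (the exact file names under ./outputs/exp1/joints depend on the prompt and --ext, so this is only a sketch):

# optional: print the shape of each generated motion, expected (nframe, 22, 3)
python -c "import glob, numpy as np; [print(f, np.load(f).shape) for f in sorted(glob.glob('./outputs/exp1/joints/**/*.npy', recursive=True))]"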

You can then follow MoMask to retarget the generated motions to other 3D characters for visualization.

Training

  1. Train the VQVAE
bash run_rvq.sh vq 0 humanml3d --batch_size 256 --num_quantizers 6 --max_epoch 50 --quantize_dropout_prob 0.2 --gamma 0.1 --code_dim2d 1024 --nb_code2d 256
  2. Train the Mask Transformer
bash run_mtrans.sh mtrans 4 humanml3d --vq_name vq --batch_size 384 --max_epoch 2000 --attnj --attnt
  3. Train the Residual Transformer
bash run_rtrans.sh rtrans 2 humanml3d --batch_size 64 --vq_name vq --cond_drop_prob 0.01 --share_weight --max_epoch 2000 --attnj --attnt

Evaluation

  1. Evaluate the VQVAE
python eval_vq.py --gpu_id 0 --name pretrain_vq --dataset_name humanml3d --ext eval --which_epoch net_best_fid.tar
  2. Evaluate the Mask Transformer
python eval_mask.py --dataset_name humanml3d --mtrans_name pretrain_mtrans --gpu_id 0 --cond_scale 4 --time_steps 10 --ext eval --which_epoch fid
  3. Evaluate the Mask + Residual Transformer

HumanML3D:

python eval_res.py --gpu_id 0 --dataset_name humanml3d --mtrans_name pretrain_mtrans --rtrans_name pretrain_rtrans --cond_scale 4 --time_steps 10 --ext eval --which_ckpt net_best_fid.tar --which_epoch fid --traverse_res

KIT-ML:

python eval_res.py --gpu_id 0 --dataset_name kit --mtrans_name pretrain_mtrans_kit --rtrans_name pretrain_rtrans_kit --cond_scale 4 --time_steps 10 --ext eval --which_ckpt net_best_fid.tar --which_epoch fid --traverse_res

Acknowledgements

We sincerely thank the authors of these excellent open-source works, on which our code is based:

MoMask

License

This code is distributed under the MIT License.

Note that our code depends on other libraries, including SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses, which must also be followed.
