(ICCV 2023) AttT2M

Code for the ICCV 2023 paper "AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism".

The pretrained models and the training/evaluation instructions have been updated. Please see below for more details.

If our paper or code is helpful to you, please cite our paper:

@InProceedings{Zhong_2023_ICCV,
    author    = {Zhong, Chongyang and Hu, Lei and Zhang, Zihao and Xia, Shihong},
    title     = {AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {509-519}
}

1. Results

1.1 Visual Results

Text-driven motion generation

Comparison with SOTA

Generation diversity

Fine-grained generation

1.2 Quantitative Results

For more results, please refer to our [Demo].

2. Installation

2.1. Environment

conda env create -f environment.yml
conda activate Att-T2M

The code was tested on Python 3.8 and PyTorch 1.8.1.
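To compare a local setup against the tested versions, a small check like the following may help. It is only a sketch: the function name `describe_environment` is hypothetical and not part of this repository, and a version mismatch does not necessarily mean the code will fail.

```python
import importlib.util
import sys

# Versions stated in this README as the tested configuration.
TESTED_PYTHON = (3, 8)
TESTED_TORCH = "1.8.1"


def describe_environment():
    """Return a dict comparing the running versions with the tested ones."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_matches": tuple(sys.version_info[:2]) == TESTED_PYTHON,
        "torch": None,
        "torch_matches": False,
    }
    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        # torch versions may carry a build suffix such as "+cu111".
        info["torch_matches"] = torch.__version__.split("+")[0] == TESTED_TORCH
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it inside the activated conda environment; any `False` entry simply flags a difference from the tested configuration.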

2.2. Datasets and others

We use two datasets: HumanML3D and KIT-ML. Details on both can be found [here].
The motion and text feature extractors used to evaluate our generated motions are also provided by t2m.

3. Quick Start

1. First step: download the pretrained models from Google Drive and arrange them as follows:

pretrain_models/
├── HumanML3D/
│   ├── Trans/
│   │   ├── net_best_fid.pth
│   │   └── run.log
│   └── VQVAE/
│       └── net_last.pth
└── KIT/
    ├── Trans/
    │   ├── net_last_290000.pth
    │   └── run.log
    └── VQVAE/
        └── net_last.pth
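Before running anything, it can be useful to confirm that the downloaded checkpoints match the layout above. The following is a minimal sketch: the helper name `missing_checkpoints` is hypothetical, and it assumes the `pretrain_models/` folder sits in the current working directory.

```python
from pathlib import Path

# Checkpoint paths copied from the directory tree above (run.log files omitted).
EXPECTED_FILES = [
    "HumanML3D/Trans/net_best_fid.pth",
    "HumanML3D/VQVAE/net_last.pth",
    "KIT/Trans/net_last_290000.pth",
    "KIT/VQVAE/net_last.pth",
]


def missing_checkpoints(root="pretrain_models"):
    """Return the expected checkpoint paths that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoints were found.")
```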

2. Second step: download the other models from Google Drive.

3. Third step: run the visualization script:

python vis.py

4. Train

Preparation: download the necessary materials from Google Drive: material1, material2.

4.1. VQ-VAE

The VQ-VAE training parameters are almost the same as those of T2M-GPT.

VQ training

python3 train_vq.py \
--batch-size 256 \
--lr 2e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 512 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--quantizer ema_reset \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name VQVAE

4.2. GPT

The results are saved in the folder output.

GPT training

python3 train_t2m_trans.py  \
--num_layers_cross 2 \
--exp-name GPT \
--batch-size 128 \
--num-layers 9 \
--embed-dim-gpt 1024 \
--nb-code 512 \
--n-head-gpt 16 \
--block-size 51 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth output/VQVAE/net_last.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--quantizer ema_reset \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu

5. Evaluation

GPT eval

python3 GPT_eval_multi.py  \
--exp-name TEST_GPT \
--batch-size 128 \
--num-layers 9 \
--num_layers_cross 2 \
--embed-dim-gpt 1024 \
--nb-code 512 \
--n-head-gpt 16 \
--block-size 51 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth output/VQVAE/net_last.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--quantizer ema_reset \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu \
--resume-trans output/GPT/net_best_fid.pth

Please replace the values of "--resume-pth" and "--resume-trans" with the paths to the VQ-VAE and Transformer models you want to evaluate.
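When evaluating several checkpoints, it may be convenient to assemble the command above in Python so that only the two checkpoint paths change. The sketch below copies its flags from the evaluation command in this section. The helper names `build_eval_command` and `run_eval` are hypothetical and not part of this repository, and the sketch uses the current interpreter (`sys.executable`) in place of `python3`.

```python
import subprocess
import sys

# Flags copied verbatim from the evaluation command above,
# except the two checkpoint paths, which are passed in separately.
FIXED_ARGS = [
    "--exp-name", "TEST_GPT",
    "--batch-size", "128",
    "--num-layers", "9",
    "--num_layers_cross", "2",
    "--embed-dim-gpt", "1024",
    "--nb-code", "512",
    "--n-head-gpt", "16",
    "--block-size", "51",
    "--ff-rate", "4",
    "--drop-out-rate", "0.1",
    "--vq-name", "VQVAE",
    "--out-dir", "output",
    "--total-iter", "300000",
    "--lr-scheduler", "150000",
    "--lr", "0.0001",
    "--dataname", "t2m",
    "--down-t", "2",
    "--depth", "3",
    "--quantizer", "ema_reset",
    "--eval-iter", "10000",
    "--pkeep", "0.5",
    "--dilation-growth-rate", "3",
    "--vq-act", "relu",
]


def build_eval_command(vq_ckpt, trans_ckpt, python=sys.executable):
    """Return the argument list for GPT_eval_multi.py with the given checkpoints."""
    return [
        python, "GPT_eval_multi.py", *FIXED_ARGS,
        "--resume-pth", vq_ckpt,
        "--resume-trans", trans_ckpt,
    ]


def run_eval(vq_ckpt, trans_ckpt):
    """Launch the evaluation; must be called from the repository root."""
    subprocess.run(build_eval_command(vq_ckpt, trans_ckpt), check=True)


if __name__ == "__main__":
    # Print the command only; call run_eval(...) to actually start it.
    cmd = build_eval_command("output/VQVAE/net_last.pth", "output/GPT/net_best_fid.pth")
    print(" ".join(cmd))
```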

Evaluating multimodality takes a long time. For a quicker evaluation without it, comment out lines 452 and 453 in ./utils/eval_trans.py.

6. Acknowledgement

Parts of the code are borrowed from public repositories such as text-to-motion, T2M-GPT, and MotionDiffuse.

