Our classification code is developed on top of PVT.
For details, please see *Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer*.
If you use this code for a paper, please cite:
```
@inproceedings{zeng2022not,
  title={Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer},
  author={Zeng, Wang and Jin, Sheng and Liu, Wentao and Qian, Chen and Luo, Ping and Ouyang, Wanli and Wang, Xiaogang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11101--11111},
  year={2022}
}
```
First, clone the repository locally:
```
git clone https://github.com/zengwang430521/TCFormer.git
```
Then, install PyTorch 1.6.0+, torchvision 0.7.0+, and timm (pytorch-image-models) 0.3.2:
```
conda install -c pytorch pytorch torchvision
pip install timm==0.3.2
```
Then install mmcv:
```
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
```
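As an optional sanity check, you can confirm the installed versions from a Python shell. The snippet below is purely illustrative and only assumes the packages above import cleanly:

```python
# Optional sanity check: print the versions of the installed dependencies.
import torch
import torchvision
import timm
import mmcv

print("torch:", torch.__version__)              # expect 1.6.0+
print("torchvision:", torchvision.__version__)  # expect 0.7.0+
print("timm:", timm.__version__)                # expect 0.3.2
print("mmcv:", mmcv.__version__)
print("CUDA available:", torch.cuda.is_available())
```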
Download and extract ImageNet train and val images from http://image-net.org/.
The directory structure follows the standard layout for torchvision `datasets.ImageFolder`, with the training and validation data expected in the `train/` and `val/` folders respectively:
```
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
```
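To double-check that the dataset is laid out correctly, here is a minimal sketch using torchvision's standard `ImageFolder` API (the resize and normalization values are the usual ImageNet defaults, assumed for illustration rather than taken from this repository's configs):

```python
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

# Standard ImageNet-style preprocessing (assumed defaults, not repo-specific).
transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# ImageFolder treats each sub-directory of val/ as one class.
val_dataset = ImageFolder("/path/to/imagenet/val", transform=transform)
print(len(val_dataset), "validation images,", len(val_dataset.classes), "classes")
```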
- TCFormer on ImageNet-1K
| Method | Size | Acc@1 | #Params (M) | Config | Checkpoint | Log |
| --- | --- | --- | --- | --- | --- | --- |
| TCFormer-light | 224 | 79.4 | 14.2 | config | 57M [Google] | [Google] |
| TCFormer | 224 | 82.3 | 25.6 | config | 103M [Google] | [Google] |
| TCFormer-large | 224 | 83.6 | 62.8 | config | 103M [Google] | [Google] |
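To verify a downloaded checkpoint against the parameter counts above, a rough sketch is given below. The exact key layout inside the checkpoint file is an assumption; inspect `ckpt.keys()` if the lookup does not match your file:

```python
import torch

# Load on CPU so the inspection does not require a GPU.
ckpt = torch.load("/path/to/checkpoint_file", map_location="cpu")

# The weights may be stored directly, or wrapped under a key such as
# 'model' or 'state_dict' (assumption; check ckpt.keys() for your file).
if isinstance(ckpt, dict):
    state_dict = ckpt.get("model", ckpt.get("state_dict", ckpt))
else:
    state_dict = ckpt

num_params = sum(p.numel() for p in state_dict.values() if torch.is_tensor(p))
print(f"#Params: {num_params / 1e6:.1f} M")
```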
To evaluate a pre-trained TCFormer on ImageNet val with a single GPU, run:
```
sh dist_train.sh configs/tcformer/tcformer.py 1 --data-path /path/to/imagenet --resume /path/to/checkpoint_file --eval
```
This should give:
```
* Acc@1 82.346 Acc@5 95.982 loss 0.798
Accuracy of the network on the 50000 test images: 82.3%
```
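For reference, Acc@1 and Acc@5 are the standard top-k accuracies over the 50,000 validation images. The sketch below shows how such numbers are computed from model logits; it is illustrative only, not the repository's evaluation code:

```python
import torch

def topk_accuracy(logits, targets, ks=(1, 5)):
    """Fraction of samples whose true label is among the top-k predictions."""
    maxk = max(ks)
    # Indices of the maxk highest-scoring classes per sample: (batch, maxk).
    _, pred = logits.topk(maxk, dim=1)
    correct = pred.eq(targets.view(-1, 1))  # (batch, maxk) boolean matrix
    return [correct[:, :k].any(dim=1).float().mean().item() for k in ks]

# Toy example: 4 samples, 10 classes.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 1, 7, 0])
acc1, acc5 = topk_accuracy(logits, targets)
print(f"Acc@1 {acc1:.3f}  Acc@5 {acc5:.3f}")
```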
To train TCFormer on ImageNet on a single node with 8 GPUs for 300 epochs, run:
```
sh dist_train.sh configs/tcformer/tcformer.py 8 --data-path /path/to/imagenet
```
If you run on a cluster managed with Slurm, you can use the script slurm_train.sh:
```
./slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```
Here is an example of using 16 GPUs to train TCFormer on the Test partition of a Slurm cluster. (Use GPUS_PER_NODE=8 to specify a single node with 8 GPUs and CPUS_PER_TASK=2 to use 2 CPUs per task; assume that Test is a valid ${PARTITION} name.)
```
GPUS=16 GPUS_PER_NODE=8 CPUS_PER_TASK=2 ./slurm_train.sh Test tcformer configs/tcformer/tcformer.py work_dirs/tcformer
```
This repository is released under the Apache 2.0 license as found in the LICENSE file.