This project is in progress. I am trying to reproduce the impressive results of the paper AlignedReID: Surpassing Human-Level Performance in Person Re-Identification using Pytorch.
If you adopt AlignedReID in your research, please cite the paper
@article{zhang2017alignedreid,
title={AlignedReID: Surpassing Human-Level Performance in Person Re-Identification},
author={Zhang, Xuan and Luo, Hao and Fan, Xing and Xiang, Weilai and Sun, Yixiao and Xiao, Qiqi and Jiang, Wei and Zhang, Chi and Sun, Jian},
journal={arXiv preprint arXiv:1711.08184},
year={2017}
}
- Model
  - ResNet-50
- Loss
  - Triplet Global Loss
  - Triplet Local Loss
  - Identification Loss
  - Mutual Loss
- Testing
  - Re-Ranking
- Speed
  - Speed up forward & backward
On Market1501, with the following settings
- Train only on Market1501 (while the paper combines 4 datasets.)
- Use only global distance, without normalizing features to unit length, with margin 0.3
- Adam optimizer, base learning rate 2e-4, decaying exponentially after 150 epochs. Train for 300 epochs in total.
| | Rank-1 (%) | mAP (%) | Rank-1 (%) after Re-ranking | mAP (%) after Re-ranking |
| --- | --- | --- | --- | --- |
| Triplet Loss | 87.05 | 71.38 | 89.85 | 85.49 |
| Triplet Loss + Mutual Loss | 88.78 | 75.76 | 91.92 | 88.27 |
Other details of the settings can be found in the code. To test my trained models or reproduce these results, see the Examples section.
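For intuition, here is a minimal sketch (not the repo's exact code) of a batch-hard triplet loss on the global distance, with margin 0.3 and unnormalized features; it is written against a modern PyTorch API for brevity, while the project itself targets Pytorch 0.1.12.

import torch

def batch_hard_triplet_loss(feats, labels, margin=0.3):
    # feats: [N, d] global features (NOT normalized); labels: [N] identity labels.
    dist = torch.cdist(feats, feats)  # pairwise Euclidean distances, [N, N]
    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)
    # Hardest positive: the farthest sample sharing the identity.
    d_ap = (dist * same_id.float()).max(dim=1)[0]
    # Hardest negative: the closest sample with a different identity
    # (same-identity entries are masked out with a large constant).
    d_an = (dist + 1e6 * same_id.float()).min(dim=1)[0]
    # Hinge on the margin between hardest positive and hardest negative.
    return torch.clamp(d_ap - d_an + margin, min=0).mean()

With ids_per_batch = 32 and ims_per_id = 4, each batch contains 128 images from which the hardest pairs are mined.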
Embarrassingly,
- Adding Identification Loss only decreases performance.
- Adding Local Distance only improves performance by ~1 point.
- Simply combining the trainval sets of three datasets does not improve performance on Market1501 (CUHK03 and DukeMTMC-reID to be tested). This is indeed a research topic.
I will try to make it work, with the help of the authors.
More scores under different settings can be found in the Excel file AlignedReID-Scores.xlsx.
This repository contains the following resources
- A beginner-level dataset interface independent of Pytorch, Tensorflow, etc., supporting multi-thread prefetching (README file is under way)
- The three most used ReID datasets: Market1501, CUHK03 (new protocol) and DukeMTMC-reID
- Python version ReID evaluation code (Originally from open-reid)
- Python version Re-ranking (Originally from re_ranking)
- Triplet Loss training examples
- Deep Mutual Learning examples
- AlignedReID (performance: stay tuned)
It's recommended that you create and enter a python virtual environment before installing our package. I personally use Anaconda which contains python and many useful packages.
git clone https://github.com/huanghoujing/AlignedReID-Re-Production-Pytorch.git
cd AlignedReID-Re-Production-Pytorch
I use Python 2.7 and Pytorch 0.1.12. For installing Pytorch 0.1.12, follow the official guide. Other packages are specified in requirements.txt:
pip install -r requirements.txt
Then install this project:
python setup.py install --record installed_files.txt
NOTE: Every time you modify files in the aligned_reid directory, you have to install the package again with python setup.py install --record installed_files.txt. Scripts that import from the aligned_reid package in fact import the copy installed in site-packages (this seems to be determined by where you run the script), so you have to re-install to pick up your changes.
Inspired by Tong Xiao's open-reid project, dataset directories are refactored to support a unified dataset interface.
The transformed dataset has the following features
- All used images, including training and testing images, are inside the same folder named images.
- The train/val/test partitions are recorded in a file named partitions.pkl, which is a dict with the following keys:
  - 'trainval_im_names'
  - 'trainval_ids2labels'
  - 'train_im_names'
  - 'train_ids2labels'
  - 'val_im_names'
  - 'val_marks'
  - 'test_im_names'
  - 'test_marks'
- The validation set consists of 100 persons (configurable when transforming the dataset) unseen in the training set, and validation follows the same ranking protocol as testing.
- Each val or test image is accompanied by a mark denoting whether it is from
  - the query set (mark == 0),
  - the gallery set (mark == 1), or
  - the multi query set (mark == 2).
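To get a feel for this format, here is a minimal sketch (not part of the repo) for inspecting a transformed partition file; the path is an example, replace it with your own save_dir.

import os.path as osp
import pickle

# Example path; use the save_dir you passed to the transform script.
partition_file = osp.expanduser('~/Dataset/market1501/partitions.pkl')
with open(partition_file, 'rb') as f:
    partitions = pickle.load(f)

print(partitions.keys())
print(len(partitions['trainval_im_names']))  # number of trainval images
# Marks: 0 = query, 1 = gallery, 2 = multi query.
print(partitions['test_marks'][:10])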
You can download what I have transformed for the project from Google Drive or BaiduYun. Otherwise, you can download the original dataset and transform it using my script, described below.
Download the Market1501 dataset from here. Run the following script to transform the dataset, replacing the paths with yours.
python script/dataset/transform_market1501.py \
--zip_file ~/Dataset/market1501/Market-1501-v15.09.15.zip \
--save_dir ~/Dataset/market1501
We follow the new training/testing protocol proposed in the paper
@inproceedings{zhong2017re,
title={Re-ranking Person Re-identification with k-reciprocal Encoding},
author={Zhong, Zhun and Zheng, Liang and Cao, Donglin and Li, Shaozi},
booktitle={CVPR},
year={2017}
}
Details of the new protocol can be found here.
You can download what I have transformed for the project from Google Drive or BaiduYun. Otherwise, you can download the original dataset and transform it using my script, described below.
Download the CUHK03 dataset from here. Then download the training/testing partition file from Google Drive or BaiduYun. This partition file specifies which images are in training, query or gallery set. Finally run the following script to transform the dataset, replacing the paths with yours.
python script/dataset/transform_cuhk03.py \
--zip_file ~/Dataset/cuhk03/cuhk03_release.zip \
--train_test_partition_file ~/Dataset/cuhk03/re_ranking_train_test_split.pkl \
--save_dir ~/Dataset/cuhk03
You can download what I have transformed for the project from Google Drive or BaiduYun. Otherwise, you can download the original dataset and transform it using my script, described below.
Download the DukeMTMC-reID dataset from here. Run the following script to transform the dataset, replacing the paths with yours.
python script/dataset/transform_duke.py \
--zip_file ~/Dataset/duke/DukeMTMC-reID.zip \
--save_dir ~/Dataset/duke
A larger training set tends to benefit deep learning models, so I combine the trainval sets of three datasets: Market1501, CUHK03 and DukeMTMC-reID. After training on the combined trainval set, the model can be tested on the three test sets as usual.
Transform the three separate datasets as introduced above if you have not done so.
For the trainval set, you can download what I have transformed from Google Drive or BaiduYun. Otherwise, you can run the following script to combine the trainval sets, replacing the paths with yours.
python script/dataset/combine_trainval_sets.py \
--market1501_im_dir ~/Dataset/market1501/images \
--market1501_partition_file ~/Dataset/market1501/partitions.pkl \
--cuhk03_im_dir ~/Dataset/cuhk03/detected/images \
--cuhk03_partition_file ~/Dataset/cuhk03/detected/partitions.pkl \
--duke_im_dir ~/Dataset/duke/images \
--duke_partition_file ~/Dataset/duke/partitions.pkl \
--save_dir ~/Dataset/market1501_cuhk03_duke
The project requires you to configure the dataset paths. In aligned_reid/tri_loss/dataset/__init__.py, modify the following snippet according to the saving paths you used when preparing the datasets.
# In file aligned_reid/tri_loss/dataset/__init__.py
########################################
# Specify Directory and Partition File #
########################################
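# NOTE: ospeu and ospj below are assumed to be shorthands for
# os.path.expanduser and os.path.join, respectively.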
if name == 'market1501':
im_dir = ospeu('~/Dataset/market1501/images')
partition_file = ospeu('~/Dataset/market1501/partitions.pkl')
elif name == 'cuhk03':
im_type = ['detected', 'labeled'][0]
im_dir = ospeu(ospj('~/Dataset/cuhk03', im_type, 'images'))
partition_file = ospeu(ospj('~/Dataset/cuhk03', im_type, 'partitions.pkl'))
elif name == 'duke':
im_dir = ospeu('~/Dataset/duke/images')
partition_file = ospeu('~/Dataset/duke/partitions.pkl')
elif name == 'combined':
assert part in ['trainval'], \
"Only trainval part of the combined dataset is available now."
im_dir = ospeu('~/Dataset/market1501_cuhk03_duke/trainval_images')
partition_file = ospeu('~/Dataset/market1501_cuhk03_duke/partitions.pkl')
After modification, install the package again:
python setup.py install --record installed_files.txt
Datasets used in this project all follow the standard evaluation protocol of Market1501, using the CMC and mAP metrics. According to open-reid, the CMC setting is as follows
# In file aligned_reid/tri_loss/dataset/__init__.py
cmc_kwargs = dict(separate_camera_set=False,
single_gallery_shot=False,
first_match_break=True)
To play with different CMC options, you can modify it accordingly. For reference, open-reid computes several CMC variants:
# In open-reid's reid/evaluators.py
# Compute all kinds of CMC scores
cmc_configs = {
'allshots': dict(separate_camera_set=False,
single_gallery_shot=False,
first_match_break=False),
'cuhk03': dict(separate_camera_set=True,
single_gallery_shot=True,
first_match_break=False),
'market1501': dict(separate_camera_set=False,
single_gallery_shot=False,
first_match_break=True)}
Training log and saved model weights can be downloaded from Google Drive or BaiduYun. In the following command, specify (1) an experiment directory for saving the testing log and (2) the path of the downloaded model_weight.pth.
python script/tri_loss/train.py \
-d '(0,)' \
--dataset market1501 \
--normalize_feature false \
-glw 1 \
-llw 0 \
-idlw 0 \
--only_test true \
--exp_dir SPECIFY_AN_EXPERIMENT_DIRECTORY_HERE \
--model_weight_file THE_DOWNLOADED_MODEL_WEIGHT_FILE
You can also train it by yourself. The following command performs training and testing automatically.
python script/tri_loss/train.py \
-d '(0,)' \
-r 1 \
--dataset market1501 \
--ids_per_batch 32 \
--ims_per_id 4 \
--normalize_feature false \
-gm 0.3 \
-glw 1 \
-llw 0 \
-idlw 0 \
--base_lr 2e-4 \
--lr_decay_type exp \
--exp_decay_at_epoch 151 \
--total_epochs 300
For mutual learning, training log and saved model weights can be downloaded from Google Drive or BaiduYun. In the following command, specify (1) an experiment directory for saving the testing log and (2) the path of the downloaded model_weight.pth.
python script/tri_loss/train.py \
-d '(0,)' \
--dataset market1501 \
--normalize_feature false \
-glw 1 \
-llw 0 \
-idlw 0 \
--only_test true \
--exp_dir SPECIFY_AN_EXPERIMENT_DIRECTORY_HERE \
--model_weight_file THE_DOWNLOADED_MODEL_WEIGHT_FILE
You can also train it by yourself. The following command performs training and testing automatically. Two ResNet-50 models are trained simultaneously with mutual loss on global distance. The example uses GPU 0 and GPU 1. NOTE that train_ml.py and train.py use different formats for specifying GPU ids; see the code for details.
python script/tri_loss/train_ml.py \
-d '((0,), (1,))' \
-r 1 \
--num_models 2 \
--dataset market1501 \
--ids_per_batch 32 \
--ims_per_id 4 \
--normalize_feature false \
-gm 0.3 \
-glw 1 \
-llw 0 \
-idlw 0 \
-pmlw 0 \
-gdmlw 1 \
-ldmlw 0 \
--base_lr 2e-4 \
--lr_decay_type exp \
--exp_decay_at_epoch 151 \
--total_epochs 300
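For intuition only, one plausible form of the mutual loss on global distance (the -gdmlw term) is an L2 penalty pulling each model's pairwise distance matrix toward the other's; see the code for the exact formulation used here.

import torch

def global_dist_mutual_loss(dist1, dist2):
    # dist1, dist2: [N, N] pairwise distance matrices from the two models.
    # Each model regresses toward the other's detached (fixed) distances,
    # so gradients flow into one model at a time.
    loss1 = ((dist1 - dist2.detach()) ** 2).mean()
    loss2 = ((dist2 - dist1.detach()) ** 2).mean()
    return loss1 + loss2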
During training, you can run TensorBoard and access port 6006 to watch the loss curves etc. E.g.
# Modify the path for `--logdir` accordingly.
tensorboard --logdir YOUR_EXPERIMENT_DIRECTORY/tensorboard
For more usage of TensorBoard, see the website and the help:
tensorboard --help
Tested with CentOS 7, Intel(R) Xeon(R) CPU E5-2618L v3 @ 2.30GHz, GeForce GTX TITAN X.
Note that the following time consumption is not guaranteed across machines, especially when the system is busy.
With the following settings
- ResNet-50
- identities_per_batch = 32, images_per_identity = 4, images_per_batch = 32 x 4 = 128
- image size h x w = 256 x 128
it occupies ~9600MB of GPU memory. For mutual learning that involves N models, it occupies N GPUs, each with ~9600MB of GPU memory.
If you do not have a GPU with more than 9600MB of memory, you have to either decrease identities_per_batch or use multiple GPUs.
Taking Market1501 as an example, it contains 31969 training images of 751 identities, thus 1 epoch = 751 / 32 = 24 iterations. Each iteration takes ~1.08s, so each epoch takes ~27s, and training for 300 epochs takes ~2.25 hours.
For training with mutual loss, we use multiple threads to parallelize the networks' forward and backward passes. Each iteration takes ~1.46s, each epoch ~35s, and training for 300 epochs takes ~2.92 hours.
Local distance is implemented in a parallel manner, so adding local loss does not noticeably increase training time, even when local hard samples are mined using the local distance (instead of the global distance).
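For reference, here is a minimal single-pair sketch of the local distance as described in the paper (the repo computes it batched and in parallel): per-stripe L2 distances are normalized to [0, 1) and the local distance is the shortest path through the distance matrix.

import numpy as np

def local_distance(x, y):
    # x: [m, d] stripe features of one image; y: [n, d] of another.
    d = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(-1))  # [m, n]
    d = (np.exp(d) - 1.0) / (np.exp(d) + 1.0)  # normalize to [0, 1)
    # Shortest path from (0, 0) to (m-1, n-1), moving only right or down,
    # which aligns stripes while keeping their vertical order.
    dp = np.zeros_like(d)
    for i in range(d.shape[0]):
        for j in range(d.shape[1]):
            if i == 0 and j == 0:
                dp[i, j] = d[i, j]
            elif i == 0:
                dp[i, j] = dp[i, j - 1] + d[i, j]
            elif j == 0:
                dp[i, j] = dp[i - 1, j] + d[i, j]
            else:
                dp[i, j] = min(dp[i - 1, j], dp[i, j - 1]) + d[i, j]
    return dp[-1, -1]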
Taking Market1501 as an example
- With images_per_batch = 32, extracting features of the whole test set (12936 images) takes ~80s.
- Computing the query-gallery global distance (a 3368 x 15913 matrix) takes ~2s.
- Computing the query-gallery local distance (a 3368 x 15913 matrix) takes ~140s.
- Computing CMC and mAP scores takes ~15s.
- Re-ranking requires computing the query-query distance (a 3368 x 3368 matrix) and the gallery-gallery distance (a 15913 x 15913 matrix, the most time-consuming part): ~80s for global distance, ~800s for local distance.
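As a hedged usage sketch, applying the Python re-ranking code (following the common re_ranking(q_g_dist, q_q_dist, g_g_dist, k1, k2, lambda_value) signature; the import path is an assumption, adjust it to where the code lives) looks roughly like:

import numpy as np
from aligned_reid.utils.re_ranking import re_ranking  # path is an assumption

# Small dummy sizes; on Market1501 the real shapes are 3368 x 15913, etc.
num_q, num_g = 100, 300
q_g_dist = np.random.rand(num_q, num_g).astype(np.float32)
q_q_dist = np.random.rand(num_q, num_q).astype(np.float32)
g_g_dist = np.random.rand(num_g, num_g).astype(np.float32)

# Returns a re-ranked query-gallery distance matrix of the same shape.
new_q_g_dist = re_ranking(q_g_dist, q_q_dist, g_g_dist,
                          k1=20, k2=6, lambda_value=0.3)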