Code release for *Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification* (TIP 2021).
Requirements:
Python 3.6, with all packages from `pip install -r requirements.txt`, including:
- torch == 0.4.1
- opencv-python
- visdom
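A quick way to confirm the environment matches the pinned versions (a minimal sketch, not part of the repo):

```python
# Minimal environment check (illustrative, not part of the repo): confirms
# the pinned torch version and that the other packages import cleanly.
import cv2
import torch
import visdom  # noqa: F401 -- only needed when training with --visualize

assert torch.__version__.startswith("0.4.1"), \
    "expected torch 0.4.1, found " + torch.__version__
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("opencv-python", cv2.__version__)
```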
- Download the FGVC image data and extract it to `data/cars/`, `data/birds/`, and `data/airs/`, respectively.
- Stanford-Cars (cars)
  /cars/
     └─── car_ims
           └─── 00001.jpg
           └─── 00002.jpg
           └─── ...
     └─── cars_annos.mat
- CUB-200-2011 (birds)
  /birds/
     └─── images.txt
     └─── image_class_labels.txt
     └─── train_test_split.txt
     └─── images
           └─── 001.Black_footed_Albatross
                 └─── Black_Footed_Albatross_0001_796111.jpg
                 └─── ...
           └─── 002.Laysan_Albatross
           └─── ...
- FGVC-Aircraft (airs)
  /airs/
     └─── images
           └─── 0034309.jpg
           └─── 0034958.jpg
           └─── ...
     └─── variants.txt
     └─── images_variant_trainval.txt
     └─── images_variant_test.txt
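Before preprocessing, it can help to verify that the layout above is in place. A minimal sketch, with directory and file names taken directly from the trees above:

```python
# Sanity-check the expected dataset layout before preprocessing (illustrative).
from pathlib import Path

EXPECTED = {
    "data/cars": ["car_ims", "cars_annos.mat"],
    "data/birds": ["images", "images.txt", "image_class_labels.txt",
                   "train_test_split.txt"],
    "data/airs": ["images", "variants.txt", "images_variant_trainval.txt",
                  "images_variant_test.txt"],
}

for root, entries in EXPECTED.items():
    for entry in entries:
        path = Path(root) / entry
        print(("ok     " if path.exists() else "MISSING"), path)
```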
- Preprocess images.
  - For birds: `python utils/split_dataset/birds_dataset.py` (the kind of split this performs is sketched after this list)
  - For cars: `python utils/split_dataset/cars_dataset.py`
  - For airs: `python utils/split_dataset/airs_dataset.py`
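For birds, the standard CUB-200-2011 metadata files (`images.txt`, `train_test_split.txt`) define the official train/test split. Below is a minimal sketch of that kind of split, assuming a copy-into-`train/`/`test/` output layout; the repo's `birds_dataset.py` may organize its output differently:

```python
# Illustrative CUB-200-2011 train/test split; the repo's
# utils/split_dataset/birds_dataset.py may produce a different layout.
import os
import shutil

root = "data/birds"
with open(os.path.join(root, "images.txt")) as f:
    id_to_path = dict(line.split() for line in f)    # "<id> <relative path>"
with open(os.path.join(root, "train_test_split.txt")) as f:
    id_to_train = dict(line.split() for line in f)   # "<id> <1=train, 0=test>"

for img_id, rel_path in id_to_path.items():
    split = "train" if id_to_train[img_id] == "1" else "test"
    dst = os.path.join(root, split, os.path.dirname(rel_path))
    os.makedirs(dst, exist_ok=True)
    shutil.copy(os.path.join(root, "images", rel_path), dst)
```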
Start:
- Run
  `python train.py --dataset {cars,airs,birds} --model {resnet50,vgg19} [options: --visualize]`
  to start training.
- For example, to train ResNet50 on Stanford-Cars:
  `python train.py --dataset cars --model resnet50`
- Run `python train.py --help` to see the full list of arguments.
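The flags above correspond to an argparse interface along these lines (a sketch only; `python train.py --help` is authoritative):

```python
# Sketch of the CLI shown above; run `python train.py --help` for the real,
# complete argument list.
import argparse

parser = argparse.ArgumentParser(description="Train AP-CNN on an FGVC dataset")
parser.add_argument("--dataset", choices=["cars", "airs", "birds"], required=True)
parser.add_argument("--model", choices=["resnet50", "vgg19"], required=True)
parser.add_argument("--visualize", action="store_true",
                    help="push attention masks and ROIs to a running visdom server")
args = parser.parse_args()
```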
Visualize:
- Run `python -m visdom.server` to start the visdom server.
- View the online attention masks and ROIs at http://localhost:8097.
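For reference, pushing an image to the dashboard takes only a couple of visdom calls. A minimal sketch with a random stand-in for an attention mask; the window and title names are made up, not the ones `train.py` uses:

```python
# Minimal visdom round-trip (illustrative; requires `python -m visdom.server`
# to be running). Window/title names here are hypothetical.
import numpy as np
import visdom

vis = visdom.Visdom()  # connects to http://localhost:8097 by default
mask = (np.random.rand(1, 224, 224) * 255).astype(np.uint8)  # fake C x H x W mask
vis.image(mask, win="attention_demo", opts=dict(title="attention mask"))
```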
Pretrained checkpoints with the following settings are available at the download link (access code: "kjqu").
| Dataset | Base model | Accuracy (%) |
| --- | --- | --- |
| CUB-200-2011 | resnet50 | 88.4 |
| Stanford-Cars | resnet50 | 95.3 |
| FGVC-Aircraft | resnet50 | 94.0 |
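Once downloaded, a checkpoint would typically be restored along these lines. This is a sketch only: the file name is hypothetical, and the checkpoint layout (plain state dict vs. wrapped dict) and the model constructor should be taken from the repo:

```python
# Illustrative checkpoint restore; the file name is hypothetical and the
# plain-state_dict assumption may not match the released files.
import torch
import torchvision

model = torchvision.models.resnet50()  # stand-in; AP-CNN wraps the base model
state = torch.load("apcnn_cars_resnet50.pth", map_location="cpu")
model.load_state_dict(state, strict=False)  # strict=False: AP-CNN adds extra heads
model.eval()
```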
If you find this paper useful in your research, please consider citing:
@ARTICLE{9350209,
author={Y. {Ding} and Z. {Ma} and S. {Wen} and J. {Xie} and D. {Chang} and Z. {Si} and M. {Wu} and H. {Ling}},
journal={IEEE Transactions on Image Processing},
title={AP-CNN: Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification},
year={2021},
volume={30},
number={},
pages={2826-2836},
doi={10.1109/TIP.2021.3055617}}
Thanks for your attention! If you have any suggestions or questions, you can leave a message here or contact us directly: