This project is in the area of Visual Object Tracking.
Paper: Learning Soft Mask Based Feature Fusion with Channel and Spatial Attention for Robust Visual Object Tracking
SCS-Siam architecture
- Prerequisites: The project was built with Python 3.7 and tested on Ubuntu 18.04 with an NVIDIA GeForce GTX 1080. It requires PyTorch 1.0 or later.
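The prerequisites above can be checked before training with a short script. This is only a minimal sketch and not part of the repository: the helper names `version_tuple` and `check_environment` are hypothetical, and treating Python 3.7 as a lower bound is an assumption (the project was built with 3.7; other versions are untested here).

```python
import re
import sys


def version_tuple(version):
    """Turn a version string such as '1.13.1+cu117' into (1, 13)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Assumption: 3.7 is treated as the minimum supported Python version.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or later is expected")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if version_tuple(torch.__version__) < (1, 0):
            problems.append("PyTorch 1.0 or later is required")
        if not torch.cuda.is_available():
            problems.append("no CUDA device is visible to PyTorch")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks fine"]:
        print(line)
```

Run it once from the environment you intend to train in; any line it prints other than "environment looks fine" points at a prerequisite to fix first.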
- Download the GOT-10k dataset from http://got-10k.aitestunion.com/downloads and extract it to a folder of your choice. In my case it is
/media/mustansar/data/benchmarks/GOT-10k
(Note: data is read from disk at run time, so extract the dataset to an SSD partition if one is available.)
- Download the ImageNet VID dataset from http://bvisionweb1.cs.unc.edu/ILSVRC2017/download-videos-1p39.php and extract it to a folder of your choice (Note: data is read from disk at run time, so extract the dataset to an SSD partition if one is available). You can delete the test part of the dataset, since it has no annotations.
- In the config.py script, change root_dir_for_GOT_10k, root_dir_for_VID and root_dir_for_OTB to your own directories:
root_dir_for_GOT_10k = '/media/mustansar/data/benchmarks/GOT-10k' <-- change to your directory
root_dir_for_VID = '/media/mustansar/data/benchmarks/VID' <-- change to your directory
root_dir_for_OTB = '/media/mustansar/data/benchmarks/OTB2015' <-- change to your directory
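A mistyped dataset path tends to surface only once training or testing starts reading data, so it can help to verify the three directories up front. The following is a minimal sketch, not part of the repository: `missing_dirs` is a hypothetical helper, and the commented usage assumes that config.py exposes the three variables at module level and can be imported from the working directory.

```python
import os


def missing_dirs(paths):
    """Return the subset of {name: path} whose path is not an existing directory."""
    return {
        name: path
        for name, path in paths.items()
        if path is None or not os.path.isdir(path)
    }


# Hypothetical usage from the repository root (assumes config.py is importable):
#
#   import config
#   names = ("root_dir_for_GOT_10k", "root_dir_for_VID", "root_dir_for_OTB")
#   problems = missing_dirs({n: getattr(config, n, None) for n in names})
#   for name, path in problems.items():
#       print(f"{name} does not point to a directory: {path!r}")
```

The helper takes a plain dictionary so that it can be tried on any set of paths without importing the project.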
- Run the train.py script:
python3 train.py
- Run the test.py script:
python3 test.py
@article{fiaz2020learning,
title={Learning Soft Mask Based Feature Fusion with Channel and Spatial Attention for Robust Visual Object Tracking},
author={Fiaz, Mustansar and Mahmood, Arif and Jung, Soon Ki},
journal={Sensors},
volume={20},
number={14},
pages={4021},
year={2020},
publisher={Multidisciplinary Digital Publishing Institute}
}