Adding a Dataset

Andrés Solís Montero edited this page Mar 22, 2016 · 9 revisions

Model-free tracking datasets are not part of the repository itself. They are stored on external servers as compressed files to facilitate download, and are downloaded while you configure and generate your vivaTracker project. Datasets are stored in tar.gz format for CMake cross-platform compatibility. Once downloaded, each dataset file is checked against its md5 hash value and extracted to the sequences folder.
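The check-then-extract step can be reproduced by hand when testing a new archive. The following Python sketch is not part of vivaTracker (where the build performs this step during configuration); the function names `md5_of_file` and `verify_and_extract` are hypothetical and only illustrate the idea.

```python
import hashlib
import tarfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive, expected_md5, sequences_dir):
    """Extract `archive` into `sequences_dir` only if its md5 matches."""
    actual = md5_of_file(archive)
    if actual != expected_md5.lower():
        raise ValueError(f"md5 mismatch for {archive}: got {actual}")
    Path(sequences_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Only extract archives you trust; tar members can carry odd paths.
        tar.extractall(sequences_dir)
```

For example, `verify_and_extract("vot2013.tar.gz", "7467447b0d533efcb458ab106c66497a", "sequences")` would extract a local copy of the archive only when its hash matches the value listed in macros.txt.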

A sequences.txt file is generated that contains the full path to the sequences folder on your system. This file is copied into the Debug and Release folders of your built project so that the user can store the datasets in a different path.

Inside the macros.txt file you will find the dataset URLs and their corresponding md5 hashes. For example, the vot2013 dataset appears as:

SET(vot2013  http://www.site.uottawa.ca/research/viva/datasets/tracking/vot2013.tar.gz)
SET(md5_vot2013  7467447b0d533efcb458ab106c66497a)
...
SET(DATASETS ${vot2013} ${vot2014} ${vot2015})

Adding a Dataset from URL

Let us assume we have a new dataset named dataset2016, compressed and stored on our public server at the URL "http://mypublicserver.com/dataset2016.tar.gz". Its md5 checksum is 8567134b0d632eafb492ab107c69213e.

To include dataset2016 in your project build, add the following two lines to the macros.txt file:

SET(dataset2016  http://mypublicserver.com/dataset2016.tar.gz)
SET(md5_dataset2016  8567134b0d632eafb492ab107c69213e)

and modify the following line by adding the new dataset variable name:

SET(DATASETS ${vot2013} ${vot2014} ${vot2015} ${dataset2016})
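If you still need the md5 checksum of a new archive, a small script can compute it and print the two lines to paste into macros.txt. This is a minimal sketch, assuming a local copy of the archive; the helper name `macro_lines` is hypothetical and not part of vivaTracker.

```python
import hashlib


def macro_lines(name, url, archive_path):
    """Build the two SET(...) lines for a dataset archive stored locally."""
    digest = hashlib.md5()
    with open(archive_path, "rb") as handle:
        # Read in 1 MiB chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return [
        f"SET({name}  {url})",
        f"SET(md5_{name}  {digest.hexdigest()})",
    ]


if __name__ == "__main__":
    # Assumes dataset2016.tar.gz sits in the current directory.
    lines = macro_lines(
        "dataset2016",
        "http://mypublicserver.com/dataset2016.tar.gz",
        "dataset2016.tar.gz",
    )
    print("\n".join(lines))
```

The DATASETS line still has to be edited by hand, as described above.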

The dataset file structure

Each tar.gz dataset file contains a root folder named after the dataset and, inside it, one folder for each sequence belonging to the dataset. Each sequence folder can optionally contain a "groundtruth.txt" file annotating the sequence.

For example, the vot2013 dataset, which contains 16 sequences stored in the compressed file vot2013.tar.gz, generates the following folder structure when extracted:

vot2013/      (root folder with dataset name)
    bicycle/         (subfolder with sequence name)
        000001.jpg   (sequence files....)
        000002.jpg
        ...
        groundtruth.txt (optional annotated groundtruth file)
    bolt/
        ...
    car/
    cup/
    david/
    diving/
    face/
    gymnastics/
    hand/
    iceskater/
    juice/
    jump/
    singer/
    sunshade/
    torus/
    woman/

Each sequence folder contains a list of alphabetically ordered image files and, optionally, their ground truth annotations.
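Before compressing a new dataset, it can help to check that its folder follows this layout. The sketch below is an illustration only: `list_sequences` is a hypothetical helper, and the set of accepted image extensions is an assumption, since the page only shows .jpg files.

```python
from pathlib import Path

# Assumed image extensions; the example dataset above only uses .jpg.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_sequences(dataset_root):
    """Map each sequence name to its sorted frames and ground truth flag."""
    sequences = {}
    subfolders = sorted(p for p in Path(dataset_root).iterdir() if p.is_dir())
    for seq_dir in subfolders:
        # Frames are ordered alphabetically by file name.
        frames = sorted(
            f.name for f in seq_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        sequences[seq_dir.name] = {
            "frames": frames,
            "has_groundtruth": (seq_dir / "groundtruth.txt").is_file(),
        }
    return sequences
```

Calling `list_sequences("vot2013")` on an extracted dataset would return one entry per sequence folder, which makes empty folders or missing annotation files easy to spot.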

Ground truth file format

The ground truth file uses one of two possible formats, as in the VOT Challenge datasets.

  1. The compressed archive contains directories of images for each sequence and per-frame annotations of the axis-aligned bounding box marking the object. The annotations are stored in a text file with the format:

x, y, width, height
where x and y are the pixel coordinates of the top left corner of the bounding box marking the object.

  2. The compressed archive contains directories of images for each sequence and per-frame annotations of the rotated bounding box marking the object. The annotations are stored in a text file with the format:

x1, y1, x2, y2, x3, y3, x4, y4
where xi and yi are the coordinates of corner i of the bounding box in frame N, listed in clockwise order.

The frame number N is the N-th row in the text file.
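The two formats can be told apart by the number of values per row. The following Python sketch reads either one; it is not vivaTracker's own parser, the function names are hypothetical, and converting the axis-aligned box to four corners is a choice made here only so that both formats yield the same shape of result.

```python
def parse_groundtruth_line(line):
    """Parse one annotation row into four (x, y) corner points."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) == 4:
        # Axis-aligned box: x, y, width, height (x, y is the top-left corner).
        x, y, w, h = values
        return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
    if len(values) == 8:
        # Rotated box: x1, y1, ..., x4, y4 given as corner coordinates.
        return list(zip(values[0::2], values[1::2]))
    raise ValueError(f"expected 4 or 8 values, got {len(values)}")


def load_groundtruth(path):
    """Return one list of corners per row; rows are read in file order."""
    with open(path) as handle:
        return [parse_groundtruth_line(line) for line in handle if line.strip()]
```

Because rows follow frame order, the first element of the list returned by `load_groundtruth` holds the annotation in the first row of the file, the second element the second row, and so on.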

The following

Moving the sequences folder