The following sections describe how to reproduce the experiments from my Bachelor Thesis. They list the required software and data and show, step by step, how to use them.
First of all, one needs to install the software that ran the experiments. Major parts of it are contained in the following git repository, in the form of Python and MATLAB scripts as well as the C++/CUDA source code of Caffe and MS-CNN: https://github.com/kingjin94/mscnn.git
The git repository is expected to be cloned into <ThesisRoot>/.
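Assuming git is available, the clone might look as follows (any path of one's choosing may be substituted for <ThesisRoot>):
git clone https://github.com/kingjin94/mscnn.git <ThesisRoot>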
Now one has two choices: either compile MS-CNN oneself or, preferably, build the Docker image which runs MS-CNN. If one wants to compile the source oneself, please follow the description given by MS-CNN in their git repository (https://github.com/zhaoweicai/mscnn) and note that the needed source is already within <ThesisRoot>/.
If one chooses the Docker image, first ensure that NVIDIA-Docker is installed; the process is described at https://github.com/NVIDIA/nvidia-docker. After installing NVIDIA-Docker, please build the Docker image with the Dockerfile found in <ThesisRoot>/docker/standalone/gpu/ and tag it caffe:MSCNN.
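A minimal build command for this could look as follows (sudo may be omitted if one's user is in the docker group):
sudo docker build -t caffe:MSCNN <ThesisRoot>/docker/standalone/gpu/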
Regardless of the chosen way to install the software, the used datasets are additionally needed:
- KITTI Object Detection Evaluation 2012 [http://www.cvlibs.net/datasets/kitti/eval_object.php]: left color images and labels
- DitM [https://fcav.engin.umich.edu/sim-dataset/]: 200k archive images or the subset used in this thesis
- DitM labels in KITTI format: <ThesisRoot>/data/DitM/label_2_DitM.tar.gz
- Subsets (to be found under <ThesisRoot>/data/{KITTI|DitM}/ImageSets/)
- Window files (to be found under <ThesisRoot>/data/{KITTI|DitM}/window_files/)
- VGG_16 [http://www.robots.ox.ac.uk/~vgg/research/very_deep/]: https://goo.gl/LCzWV8, to be saved in <models>/
Both datasets are expected to reside in folders named <KITTIRoot>/ and <DitMRoot>/ respectively. The KITTI dataset is the blueprint for the folder structure; therefore, some alterations to DitM have to be made:
- Move the images from the DitM 200k archive or the smaller subset to the folder <DitMRoot>/image_2/
- Extract the labels from DitM, found in <ThesisRoot>/data/DitM/label_2_DitM.tar.gz, to <DitMRoot>/label_2/
- If one uses the 200k archive: resize the images by first rescaling them to a width of 1280 (with constant aspect ratio) and then cropping them to a height of 384 by cutting off top and bottom equally. Furthermore, convert them to .png (see the sketch after this list).
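One possible way to perform this preprocessing is sketched below in Python using the Pillow library. The library choice as well as the source and target folder names are assumptions, not part of the original tooling:
# resize_ditm.py - bring DitM 200k images to 1280x384 and save them as .png
# Assumes Pillow is installed (pip install Pillow); paths are placeholders.
import glob
import os
from PIL import Image

SRC = "<DitMRoot>/image_2_raw/"  # original 200k images (hypothetical folder)
DST = "<DitMRoot>/image_2/"      # target folder expected by the scripts
TARGET_W, TARGET_H = 1280, 384

os.makedirs(DST, exist_ok=True)
for path in glob.glob(os.path.join(SRC, "*")):
    img = Image.open(path)
    # Rescale to a width of 1280, keeping the aspect ratio.
    new_h = round(img.height * TARGET_W / img.width)
    img = img.resize((TARGET_W, new_h), Image.BILINEAR)
    # Crop to a height of 384 by cutting off top and bottom equally.
    top = (new_h - TARGET_H) // 2
    img = img.crop((0, top, TARGET_W, top + TARGET_H))
    # Save as .png under the original base name.
    base = os.path.splitext(os.path.basename(path))[0]
    img.save(os.path.join(DST, base + ".png"))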
After the datasets are set up, the evaluation scripts have to be adapted to the local setup:
- In <ThesisRoot>/examples/{KITTI|DitM}/evalFunc.m, adapt line 20 to the appropriate ground-truth directory, which is <KITTIRoot>/training/label_2/ or <DitMRoot>/label_2/ respectively.
- In <ThesisRoot>/examples/KITTI/image_size.m, change line 2 to <KITTIRoot>/.
- Compile evaluate_object.cpp in <ThesisRoot>/examples/kitti_result/eval/ with g++ or another C++ compiler and name the program evaluate_object (see the example command after this list).
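With g++, the compilation might, for example, be done like this (the exact flags are not prescribed here; if the source needs additional libraries, such as Boost in the original KITTI devkit, the corresponding flags have to be added):
cd <ThesisRoot>/examples/kitti_result/eval/
g++ -O2 -o evaluate_object evaluate_object.cpp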
Training with the provided Docker image is quite easy. One first has to run the image with the data mounted at the right places. This is done by:
For KITTI:
sudo nvidia-docker run -ti -v <KITTIRoot>/:/home/data/KITTI/ \
-v <ThesisRoot>/examples/:/home/mscnn/examples/ \
-v <ThesisRoot>/data/:/home/mscnn/data/ \
-v <models>/:/home/mscnn/models/ caffe:MSCNN
For DitM:
sudo nvidia-docker run -ti -v <DitMRoot>/:/home/data/DitM \
-v <ThesisRoot>/examples/:/home/mscnn/examples/ \
-v <ThesisRoot>/data/:/home/mscnn/data/ \
-v <models>/:/home/mscnn/models/ caffe:MSCNN
Within the Docker container one finds a folder structure similar to <ThesisRoot>/. To run a training session, go to examples/{kitti_car|DitM}/<OneExperiment>/. Therein lies mscnn_train.sh, which runs the training. Please adapt the script to the number of GPUs by changing the --gpu flag to the IDs of the ones to be used (see the example below).
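As an illustration of the flag only, a Caffe training call that uses the first two GPUs would end in something like the following (the solver file name is a placeholder; the actual line in mscnn_train.sh may differ per experiment):
caffe train --solver=<solver.prototxt> --gpu=0,1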
A log of the progress is written to stdout, which one may want to save to a log file. Lines containing "Loss = " give a good sense of the training progress. After the training is done, one will find a file named mscnn_kitti_train_2nd_iter_25000.caffemodel within the folder; this file holds the trained weights.
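One way to capture the log and follow the loss, assuming a standard Unix shell, is:
./mscnn_train.sh 2>&1 | tee train.log    # mirror stdout/stderr into train.log
grep "Loss = " train.log                 # extract the loss lines afterwards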
The evaluation of the trained networks with the Docker image is a two-stage process, which allows the more demanding part (inference) to be run on a powerful off-site server if necessary. The Docker image handles the forwarding of images through the trained network and produces intermediate output in the form of .tar.gz archives. These may then be analyzed with the MATLAB script provided in <ThesisRoot>/examples/{kitti_car|DitM}/.
Therefore, the following steps are to be performed:
- Make the following directories and ensure about 10 GB of free disk space: <tmp>/ and <output>/
- Start the container, depending on whether to evaluate on KITTI or DitM.
KITTI:
sudo nvidia-docker run -ti -v <KITTIRoot>/:/home/data/KITTI/ \
-v <ThesisRoot>/examples/:/home/mscnn/examples/ \
-v <output>/:/home/output/ -v <tmp>/:/tmp/ caffe:MSCNN
DitM:
sudo nvidia-docker run -ti -v <DitMRoot>/:/home/data/DitM \
-v <ThesisRoot>/examples/:/home/mscnn/examples/ \
-v <output>/:/home/output/ -v <tmp>/:/tmp/ caffe:MSCNN
- Within the container, go to examples/{kitti_car|DitM}/ and run "python run_elementary_detection.py <FolderWithWantedNetwork> <FilenameOfWantedNetwork>". If one wants to do a cross-evaluation, line 61 f. of run_elementary_detection.py has to be adapted to the correct image folder!
- After the Python script has finished, the output will have been saved to <output>/. Copy the archived results from <output>/ to <ThesisRoot>/examples/{kitti_car|DitM}/outputDetection/, depending on whether the evaluation is done on KITTI or DitM (see the command sketch after this list).
- Run evalOutputDetection within the right folder with MATLAB.
- The final result (recall-precision over the validation set) will be saved to <ThesisRoot>/examples/{kitti_car|DitM}/detections/<NameOfIntermediateResult>/plot/
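On the host, the copy and MATLAB steps might, for example, look as follows for KITTI (the archive pattern and the batch-mode MATLAB invocation are assumptions; replace kitti_car by DitM where appropriate):
cp <output>/*.tar.gz <ThesisRoot>/examples/kitti_car/outputDetection/
cd <ThesisRoot>/examples/kitti_car/
matlab -nodisplay -r "evalOutputDetection; exit"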