This model is a cascade of multiple networks for predicting video frames. The input can be an early fusion of different visual modalities (depth and RGB).
**Note:** Please refer to my bachelor's thesis *Evaluating multi-stream networks for self-supervised representation learning* for details.
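As a rough illustration of what "early fusion" means here (this is not the repo's actual preprocessing code; shapes and sizes are made up):

```python
# Illustrative sketch of early fusion: the modalities are concatenated along
# the channel axis before entering the first network, so a single fused
# tensor is processed, rather than one stream per modality (late fusion).
import numpy as np

rgb = np.random.rand(16, 128, 128, 3).astype(np.float32)    # (frames, H, W, 3)
depth = np.random.rand(16, 128, 128, 1).astype(np.float32)  # (frames, H, W, 1)

fused = np.concatenate([rgb, depth], axis=-1)
print(fused.shape)  # (16, 128, 128, 4)
```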
- Install YASSMLTK from https://git.tu-berlin.de/cvrs/mltk
- Clone this project
- Go to the experiment folder, e.g.
  ```sh
  cd <PATH_TO_THIS_REPO>/example
  ```
- Edit the `config.yml` accordingly
- Create the `DATA_DIR` if necessary, and go into it:
  ```sh
  cd <DATA_DIR>
  ```
- Make sure you have at least 35 GB of free disk space
- Download the example dataset `carla/default3_small`. This is a small dataset similar to `carla/default4`, which was used in the thesis:
  ```sh
  mkdir -p downloads/manual/carla/default3_small
  cd downloads/manual/carla/default3_small/
  # download them manually or with e.g. wget (8GB)
  wget https://tubcloud.tu-berlin.de/s/BMp8ZmZi3S3mxbq/download -O params.zip
  wget https://tubcloud.tu-berlin.de/s/mzGJB8wZRDCYTwa/download -O Town01_Opt.zip
  wget https://tubcloud.tu-berlin.de/s/J73sPnacQKFgttt/download -O Town10HD_Opt.zip
  ```
- Go back to the experiment folder, e.g.
  ```sh
  cd <PATH_TO_THIS_REPO>/example
  ```
- Run the experiment with mltk, e.g. as follows (see the documentation of YASSMLTK for parameters):
  ```sh
  # this will download the docker images (12GB), extract the downloaded
  # zips (14GB), and train and evaluate the experiment
  python -m yassmltk.run <PATH_TO_THIS_REPO>/example
  ```
- All evaluation results are saved in `<PATH_TO_THIS_REPO>/example/eval`, e.g. the evaluation metrics in `metrics.yml` (a short sketch for reading it in Python follows this list)
- Use `tensorboard --logdir .` to view the training curves and to track the training process
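If you want to inspect the results programmatically rather than by eye, a minimal sketch such as the following works; it only assumes that `metrics.yml` is a flat YAML mapping (the actual key names depend on the experiment configuration):

```python
# Minimal sketch: load and print the evaluation metrics.
# Assumes PyYAML is installed and metrics.yml is a flat mapping.
import yaml

with open("example/eval/metrics.yml") as f:
    metrics = yaml.safe_load(f)

for name, value in metrics.items():
    print(f"{name}: {value}")
```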
If you want to use `carla/default4` (112 GB raw data, 44 GB TFRecords) instead, you need to generate it first with `lnschroeder/carla-dataset-generator`.
The YASSMLTK tool first calls `train()`, then `evaluate()` in the `src.models.srivastava` module.
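Conceptually, the module therefore only needs to expose these two entry points. The skeleton below is a hypothetical illustration of that contract; apart from the function names and the module path, everything (including the argument-free signatures) is an assumption, and the actual interface is defined by YASSMLTK:

```python
# src/models/srivastava.py -- hypothetical skeleton, for orientation only

def train():
    """Build the model from the experiment config and fit it on the dataset."""
    ...

def evaluate():
    """Restore the trained weights and write the results, e.g. metrics.yml."""
    ...
```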
As a side note: Srivastava is the first author of the paper introducing the composite model on which our model is based. See *Unsupervised Learning of Video Representations using LSTMs*: https://arxiv.org/abs/1502.04681.
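For orientation, here is a heavily simplified Keras sketch of that composite architecture; layer sizes, sequence lengths, and the flattened-frame representation are illustrative assumptions, and both the paper's model and this repo's implementation differ in many details:

```python
# Sketch of the composite model idea: one LSTM encoder whose final state
# feeds two decoders -- one reconstructs the input sequence, the other
# predicts future frames. All shapes and sizes here are made up.
import tensorflow as tf
from tensorflow.keras import layers

T_IN, T_OUT, D = 10, 10, 64 * 64  # input/prediction lengths, flattened frame size

frames = tf.keras.Input(shape=(T_IN, D))
state = layers.LSTM(256)(frames)  # encoder: keep only the final hidden state

# Decoder 1: reconstruct the input sequence from the encoded state
rec = layers.RepeatVector(T_IN)(state)
rec = layers.LSTM(256, return_sequences=True)(rec)
rec = layers.TimeDistributed(layers.Dense(D), name="reconstruction")(rec)

# Decoder 2: predict the future frames from the same state
fut = layers.RepeatVector(T_OUT)(state)
fut = layers.LSTM(256, return_sequences=True)(fut)
fut = layers.TimeDistributed(layers.Dense(D), name="prediction")(fut)

model = tf.keras.Model(frames, [rec, fut])
model.compile(optimizer="adam", loss="mse")  # joint reconstruction + prediction loss
```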