Repository for the International Challenge on Activity Recognition (ActivityNet) Dense Captioning

skasai5296/actnetchallenge

actnetchallenge: Task 3 (Dense-Captioning Events in Videos)

Repo for the ActivityNet Challenge 2019, Task 3 (Dense-Captioning Events in Videos). This repository provides a dense video captioning module for the ActivityNet Captions dataset.

TO-DO:

  • complete script for downloading ActivityNet videos
  • complete script for converting .mp4 videos to .jpg frames
  • write dataset class for ActivityNet Captions dataset
  • write baseline model for training
  • add optional training
  • add evaluation
  • add spatiotemporal attention
  • add proposal generation code
  • add testing code
  • add Transformer training
  • add BERT training
  • add character level training

Requirements

  • Python>=3.6
  • numpy
  • matplotlib
  • Pillow
  • accimage (optional, faster than Pillow)
  • pytorch>=1.0
  • torchvision>=0.2
  • pytube
  • torchtext (for spacy tokenizer and vocabulary)
  • nlg-eval (for evaluation metrics)
  • mkl-service (for theano, evaluation)
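
Before running any of the scripts, it can help to check which of the requirements above are importable. The snippet below is a minimal sketch, not part of this repository. The import names are assumptions: Pillow is imported as `PIL` and pytorch as `torch`, while `nlgeval` (for nlg-eval) and `mkl` (for mkl-service) are my best guess and may need adjusting. The helper `missing_modules` is a hypothetical name.

```python
import importlib.util

# Assumed import names for the packages listed above; adjust if they differ.
REQUIRED = ["numpy", "matplotlib", "PIL", "torch", "torchvision", "pytube", "torchtext"]
# accimage is optional; nlgeval and mkl are only needed for evaluation (assumed names).
OPTIONAL = ["accimage", "nlgeval", "mkl"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

This only checks that a module can be found; it does not verify the minimum versions listed above.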

How to download ActivityNet Captions Dataset (ActivityNet Videos + Annotations)

  1. Download the JSON file for the ActivityNet dataset from here
  2. Edit download.sh and set the command-line argument for the root directory where the dataset will be saved. This path is denoted $root_path below.
  3. Make sure you have at least 300GB of free storage.
  4. Run bash download.sh to download the .mp4 files.
  5. Download the JSON files for the ActivityNet Captions dataset from here
  6. Extract the downloaded files to $root_path
  7. Run python utils/add_fps_into_activitynet_json.py -v ${video_dir} -s ${root_path}/train.json -o ${save_path}
  8. Run python utils/add_fps_into_activitynet_json.py -v ${video_dir} -s ${root_path}/val_1.json -o ${save_path}
  9. Run python utils/add_fps_into_activitynet_json.py -v ${video_dir} -s ${root_path}/val_2.json -o ${save_path}
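
Steps 7 to 9 differ only in the annotation split, so they can be generated in a loop. The following is a minimal sketch that builds the three argument lists. The function name `fps_commands` is hypothetical, and the same `save_path` is passed for every split only because the steps above write it that way. Whether the script expects a directory or a file there is not stated here.

```python
def fps_commands(video_dir, root_path, save_path, splits=("train", "val_1", "val_2")):
    """Build one argument list per annotation split, mirroring steps 7-9."""
    return [
        [
            "python", "utils/add_fps_into_activitynet_json.py",
            "-v", video_dir,
            "-s", f"{root_path}/{split}.json",
            "-o", save_path,
        ]
        for split in splits
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.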

How to convert video files to image files

  1. Make sure you have at least 1TB of free space and enough free inodes on your storage.
  2. Run python utils/mp42jpg.py ${video_dir} ${root_path}/frames activitynet --n_jobs=${number_of_workers}
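
The storage check in step 1 can be done programmatically before starting the conversion. This is a minimal sketch using only the Python standard library. `has_capacity` is a hypothetical helper, `os.statvfs` is available on POSIX systems only, and the inode threshold is left to the caller because the number of frames depends on the videos downloaded.

```python
import os
import shutil


def has_capacity(path, min_free_bytes, min_free_inodes=0):
    """Return True if the filesystem holding `path` has enough free bytes and inodes."""
    free_bytes = shutil.disk_usage(path).free
    # f_favail: free inodes available to unprivileged users (POSIX only).
    free_inodes = os.statvfs(path).f_favail
    return free_bytes >= min_free_bytes and free_inodes >= min_free_inodes
```

For example, `has_capacity(root_path, 10**12)` checks for roughly 1TB of free space. Some filesystems do not report inode counts, so an inode threshold above zero may fail there even when space is available.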

Training procedures

  1. Run train.py with configurations (script is in train/trainscript.sh)

Testing procedures

  1. Proposal generation is not implemented yet, so prepare a JSON file with proposals.
  2. Run test.py with configurations (script is in eval/eval.sh)

Samples

Transformer Captions
