# 🚧 An Unsupervised Reinforcement Learning Pipeline for Video Frame Classification
🚧 This is a Proof of Concept Project 🚧 |
---|
🚧 Authors are not Responsible for Damages to Life and Property if Deployed 🚧 |
---|
The algorithm we use is inspired by the work of Anand et al. (2019) at Mila and Microsoft Research. We repurpose it for video frame classification.
- Inspired by human learning, which is largely unsupervised, a state representation learning algorithm learns high-level features from image frames without labels or explicit rewards, and without modelling the pixels directly.
- As we work with frames of a video, our data is temporally consistent. Local consistency is also observed, since most objects don't move drastically between consecutive frames. We exploit these structures to learn the representations directly.
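The temporal structure described above provides training signal for free: adjacent frames form positive pairs, while frames from other time steps serve as negatives. A minimal sketch of this sampling scheme (the helper name and interface are hypothetical, not the project's actual code):

```python
import numpy as np

def sample_contrastive_pairs(frames, num_negatives=4, rng=None):
    """Sample an (anchor, positive, negatives) triple from a frame sequence.

    The positive is the temporally adjacent frame (t, t+1); negatives are
    drawn from other, non-adjacent time steps. Illustrates how temporal
    consistency yields contrastive pairs without any labels.
    """
    rng = rng or np.random.default_rng(0)
    T = len(frames)
    t = int(rng.integers(0, T - 1))              # anchor time step
    anchor, positive = frames[t], frames[t + 1]
    # Negatives come from time steps outside the {t, t+1} window.
    candidates = [i for i in range(T) if i not in (t, t + 1)]
    neg_idx = rng.choice(candidates, size=num_negatives, replace=False)
    negatives = [frames[i] for i in neg_idx]
    return anchor, positive, negatives
```

In practice the anchor and positive would be encoded frames; here plain indices stand in for them to keep the sketch self-contained.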
Fig. 2 (right) shows the contrastive task used to train the final discriminator. We use a bilinear model to compute the score function from the output of the representation encoder below it. The discriminator's objective assigns large values to positive examples and small values to negative examples by maximizing the bound in the top equation.
This translates into maximizing true positives while minimizing missed detections and false alarms.
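The bilinear score and the bound it maximizes can be sketched as an InfoNCE-style classification loss: the positive example should out-score every negative under the score function f(x, y) = xᵀWy. This is a hypothetical NumPy illustration, not the project's trainer; the real pipeline learns W jointly with the encoder.

```python
import numpy as np

def bilinear_scores(z_anchor, z_candidates, W):
    """Bilinear score f(x, y) = y^T W x for each candidate against the anchor."""
    return z_candidates @ W @ z_anchor           # shape: (num_candidates,)

def info_nce_loss(z_anchor, z_pos, z_negs, W):
    """InfoNCE-style objective: cross-entropy with the positive as the
    correct class among {positive} U {negatives}.

    Minimizing this loss pushes the positive's score up and the
    negatives' scores down, which maximizes the contrastive bound.
    """
    cands = np.vstack([z_pos[None, :], z_negs])  # positive placed first
    scores = bilinear_scores(z_anchor, cands, W)
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]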
Get the dataset from here and place it under `datasets/`.
```shell
python runner.py --arch [cnn, dqn, usrl]
```
The trained weights will be stored in the same directory as the runner script.
```shell
python test.py
```
- CNN
- RL - DQN
- RL - USRL
- Live cam test script
- Creating a custom gym env
- Boilerplate for trainer scripts
- DQN Implementation
- Unsupervised State Representation Learning
- Project Inspiration
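The DQN component listed above uses the standard temporal-difference bootstrap target. A minimal sketch of that target computation (function name and interface are hypothetical; the project's trainer would compute this from the target network's Q-values):

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """DQN bootstrap target: r + gamma * max_a' Q(s', a') for non-terminal
    transitions, and plain r when the episode has ended (done == 1)."""
    max_next_q = next_q_values.max(axis=1)       # greedy action value in s'
    return rewards + gamma * max_next_q * (1.0 - dones)
```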
```bibtex
@article{anand2019unsupervised,
  title={Unsupervised State Representation Learning in Atari},
  author={Anand, Ankesh and Racah, Evan and Ozair, Sherjil and Bengio, Yoshua and C{\^o}t{\'e}, Marc-Alexandre and Hjelm, R Devon},
  journal={arXiv preprint arXiv:1906.08226},
  year={2019}
}
```