Playing Atari Breakout with Reinforcement Learning (Deep Q-Learning)
This project implements the Deep Q-Learning algorithm as described in this paper.
- Python 3.x
- Numpy
- OpenCV-Python
- PyGame
- PyTorch
git clone https://github.com/SnnGnc/Deep-Q-Network-AtariBreakoutGame.git
cd Deep-Q-Network-AtariBreakoutGame
- To train the agent:
python dqn.py train
- To test the pre-trained version:
python dqn.py test
If you are curious about reinforcement learning, I highly recommend reading Demystifying Deep Reinforcement Learning.
"Working directly with raw Atari frames, which are 210 × 160 (in our case it depends on the pygame screen) pixel images with a 128 color palette, can be computationally demanding, so we apply a basic preprocessing step aimed at reducing the input dimensionality. The raw frames are preprocessed by first converting their RGB representation to gray-scale and down-sampling it to an 84×84 image." This preprocessing is applied to the last 4 frames of a history, and the results are stacked to produce the input to the Q-function. This process can be visualized as in the following figure:
And convert these images to gray scale...
And send these into the Q-Network.
To summarize, we:
- Take last 4 frames
- Resize images to 84x84
- Convert frames to gray-scale
- Stack them into an 84x84x4 input array and send it into the Q-Network.
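The preprocessing steps above can be sketched roughly as follows. This is an illustration, not the code in dqn.py: the names `to_grayscale`, `resize_nearest`, `preprocess`, and `FrameStack` are hypothetical, and the sketch uses only NumPy with nearest-neighbour down-sampling. Since the project depends on OpenCV-Python, `cv2.cvtColor` and `cv2.resize` could be used for the same two steps instead. Filling the stack with copies of the first frame at the start of an episode is also an assumption.

```python
from collections import deque

import numpy as np


def to_grayscale(frame_rgb):
    """Convert an (H, W, 3) RGB frame to an (H, W) float32 gray-scale frame."""
    # ITU-R BT.601 luma weights, a common choice for RGB -> gray conversion.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return frame_rgb.astype(np.float32) @ weights


def resize_nearest(frame, size=84):
    """Down-sample a 2-D frame to (size, size) with nearest-neighbour sampling."""
    height, width = frame.shape
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return frame[rows[:, None], cols]


def preprocess(frame_rgb, size=84):
    """Gray-scale and down-sample one raw frame to (size, size)."""
    return resize_nearest(to_grayscale(frame_rgb), size)


class FrameStack:
    """Keeps the last `num_frames` preprocessed frames (hypothetical helper)."""

    def __init__(self, num_frames=4):
        self.frames = deque(maxlen=num_frames)

    def push(self, frame_rgb):
        """Add a raw frame and return the stacked (84, 84, num_frames) state."""
        processed = preprocess(frame_rgb)
        if not self.frames:
            # Assumption: at episode start, repeat the first frame to fill the stack.
            self.frames.extend([processed] * self.frames.maxlen)
        else:
            self.frames.append(processed)  # the oldest frame is dropped automatically
        return np.stack(self.frames, axis=-1)


# Usage with a random stand-in for a pygame screen capture.
stack = FrameStack()
screen = np.random.randint(0, 256, size=(210, 160, 3), dtype=np.uint8)
state = stack.push(screen)
print(state.shape)  # (84, 84, 4)
```

The stack is kept channels-last to match the 84x84x4 layout described above; a PyTorch convolution would typically expect the frame axis first, so a transpose may be needed before the network.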
The input to the neural network is an 84 × 84 × 4 image produced by φ. The first hidden layer convolves 32 8 × 8 filters with stride 4 with the input image and applies a rectifier nonlinearity. The second hidden layer convolves 64 4 × 4 filters with stride 2, again followed by a rectifier nonlinearity. The third hidden layer is fully connected: it takes the flattened 7x7x64 tensor as input and produces 512 outputs, followed by a rectifier nonlinearity. The final hidden layer is fully connected and consists of 512 rectifier units. The output layer is a fully connected linear layer with a single output for each valid action. There are two valid actions, encoded as 1 for left and 0 for right. The architecture of the network is shown in the figure below: (Coming...)
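The layer sizes can be checked with the standard convolution output-size formula, floor((size + 2·padding − kernel) / stride) + 1. The sketch below is only an illustration under stated assumptions and is not taken from dqn.py: it assumes no padding, and the helper names `conv_output_size` and `trace_shapes` are hypothetical. With only the two convolutions listed, an 84x84 input yields a 9x9x64 feature map; a 7x7x64 input to the fully connected layer corresponds to an additional 3x3, stride-1 convolution with 64 filters, as in the Nature version of DQN. That third convolution is included here as an assumption, not as a confirmed detail of this project.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size, layers):
    """Return the (side, side, channels) shape after each conv layer.

    `layers` is a list of (filters, kernel, stride) tuples.
    """
    shapes = []
    size = input_size
    for filters, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        shapes.append((size, size, filters))
    return shapes


# The two convolutions exactly as listed in the text (no padding assumed).
two_conv = [(32, 8, 4), (64, 4, 2)]

# Assumption: a third 3x3, stride-1 convolution with 64 filters.
three_conv = two_conv + [(64, 3, 1)]

print(trace_shapes(84, two_conv))    # [(20, 20, 32), (9, 9, 64)]
print(trace_shapes(84, three_conv))  # [(20, 20, 32), (9, 9, 64), (7, 7, 64)]
```

Under the three-convolution assumption, the flattened input to the 512-unit fully connected layer has 7 × 7 × 64 = 3136 values.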
Any contribution is welcome.