This project applies Deep Reinforcement Learning to the Unity ML-Agents Banana environment.
More information on the algorithm, NN architecture and hyper-parameters can be found in this report.
The environment consists of a closed room containing yellow and blue bananas.
The goal is to find yellow bananas and avoid blue bananas. It is an episodic environment: the agent gets a fixed number of steps in which to maximise its reward.
The environment is considered to be solved if an agent gets an average reward of at least 13 over 100 episodes.
A reward of +1 is earned by catching a yellow banana. A penalty of -1 is given for catching a blue banana. Reward at all other times is 0.
This embodies the goal of catching as many yellow bananas as possible and avoiding the blue ones.
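As a concrete illustration of the solving criterion, here is a minimal sketch of a rolling-average check; the variable and function names are hypothetical and not taken from this project's code.

```python
from collections import deque

import numpy as np

# Rolling window of the last 100 episode scores (hypothetical variable name).
recent_scores = deque(maxlen=100)

def record_and_check(episode_score):
    """Append an episode's total reward and report whether the environment counts as solved."""
    recent_scores.append(episode_score)
    return len(recent_scores) == 100 and np.mean(recent_scores) >= 13.0
```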
The state space is continuous. It consists of vectors of size 37, encoding the agent's velocity together with a ray-based representation of the agent's local field of vision: the rays indicate the presence of objects at a number of fixed angles in front of the agent.
The action space is discrete and consists of four options (a short interaction sketch follows the list):
- go forward (0)
- go backward (1)
- go left (2)
- go right (3)
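The sketch below shows how these spaces look when interacting with the environment through the `unityagents` Python API used in the course; the file path is an assumption for the macOS build, and the variable names are illustrative.

```python
import numpy as np
from unityagents import UnityEnvironment

# Path is an assumption for the macOS build unzipped into bin/ (see setup below).
env = UnityEnvironment(file_name="bin/Banana.app")
brain_name = env.brain_names[0]

env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]                          # vector of size 37
action_size = env.brains[brain_name].vector_action_space_size   # 4 discrete actions

action = np.random.randint(action_size)                          # e.g. 0 = go forward
env_info = env.step(action)[brain_name]
reward = env_info.rewards[0]                                     # +1 yellow, -1 blue, 0 otherwise
next_state = env_info.vector_observations[0]
done = env_info.local_done[0]                                    # True when the episode ends

env.close()
```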
The project requires Python 3, PyTorch 0.4, the Unity ML-Agents API, and the Unity ML-Agents environment, which you can unzip in this project's bin directory (for Mac OS) or download from Udacity.
To install the requirements, first create an Anaconda environment (or another virtual env of your choice) for the project using Python 3.6: `conda create --name env_name python=3.6 -y`
Activate the environment: `conda activate env_name`
Then go to the project's `python` directory and install the requirements in your environment: `pip install .`
Make sure the Unity environment is present in the `bin/` directory and the corresponding name has been set in the `ENV_APP` constant in `config.py`.
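For example, `config.py` might then contain something along these lines; the constant name comes from the note above, while the path is an assumption for the macOS build.

```python
# config.py (illustrative excerpt): point ENV_APP at the unzipped Unity app.
ENV_APP = "bin/Banana.app"
```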
The project is run from the command line. There are two entry points: one trains an agent from scratch, the other shows an agent performing a single episode.
To train a new agent, run the `train.py` script. It currently only supports training a DQN agent, so no command-line arguments are necessary. (Training parameters are set in `config.py` and the `agent/factory.py` module.)
When the environment is solved, the training script saves a checkpoint in the `saved_models` directory. During training, a checkpoint is also saved every 100 iterations. Both can be loaded with the watch script (see next point).
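A checkpoint of this kind is typically just the Q-network's weights saved with PyTorch. The sketch below uses a stand-in network and a hypothetical filename; only the `saved_models` directory comes from the text above, and the real architecture is described in the report.

```python
import os

import torch
import torch.nn as nn

# Stand-in Q-network: 37 state inputs, 4 action values (illustrative only).
q_network = nn.Sequential(nn.Linear(37, 64), nn.ReLU(), nn.Linear(64, 4))

# Save a checkpoint (the filename is an assumption).
os.makedirs("saved_models", exist_ok=True)
torch.save(q_network.state_dict(), "saved_models/checkpoint.pth")

# The watch script can later rebuild the same network and restore the weights.
q_network.load_state_dict(torch.load("saved_models/checkpoint.pth"))
```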
To watch an agent perform a single episode, run the `watch.py` script and specify which agent you would like to see using the `--agent` option. Available choices are:
- `random`: shows a perfectly stupid agent (actions are picked at random).
- `dqn_pretrained`: shows a pre-trained agent.
- `dqn_checkpoint`: shows the last checkpoint saved during training.
- `dqn_solved`: shows the last solution reached by the training script.
Example: `python watch.py --agent=dqn_pretrained`
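Internally, an option like this can be handled with standard `argparse` choices; the snippet below is a rough sketch, not the project's actual code.

```python
import argparse

# Illustrative parser for watch.py's --agent option.
parser = argparse.ArgumentParser(description="Watch an agent play one episode.")
parser.add_argument(
    "--agent",
    choices=["random", "dqn_pretrained", "dqn_checkpoint", "dqn_solved"],
    default="dqn_pretrained",
    help="Which agent to load and watch.",
)
args = parser.parse_args()
print(f"Watching agent: {args.agent}")
```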
This project is part of my Udacity DRL Nanodegree.