
Deep Deterministic Policy Gradient

This project is an implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm for solving a Unity environment.
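For readers unfamiliar with the algorithm, the following is a minimal, generic sketch of the core DDPG update step in PyTorch (critic regression against target networks, deterministic policy gradient for the actor, and soft target updates). It is only an illustration and does not reproduce the network architectures, replay buffer, or noise process used in this repository.

import torch
import torch.nn.functional as F

def ddpg_update(actor, critic, actor_target, critic_target,
                actor_opt, critic_opt, batch, gamma=0.99, tau=1e-3):
    # batch: tensors (states, actions, rewards, next_states, dones)
    states, actions, rewards, next_states, dones = batch

    # Critic update: regress Q(s, a) towards the bootstrapped target
    with torch.no_grad():
        next_actions = actor_target(next_states)
        q_next = critic_target(next_states, next_actions)
        q_targets = rewards + gamma * q_next * (1.0 - dones)
    critic_loss = F.mse_loss(critic(states, actions), q_targets)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor update: maximize the critic's value of the actor's actions
    actor_loss = -critic(states, actor(states)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Soft-update the target networks towards the local networks
    for target, local in ((actor_target, actor), (critic_target, critic)):
        for t_param, l_param in zip(target.parameters(), local.parameters()):
            t_param.data.copy_(tau * l_param.data + (1.0 - tau) * t_param.data)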

Required packages:

  • NumPy
  • Python (version 3.6)
  • PyTorch
  • unityagents

Dependencies

The best way to run the code in this repository is to create a conda environment by following the instructions below.

  1. Create (and activate) a new environment with Python 3.6.

    • Linux or Mac:
    conda create --name drlnd python=3.6
    source activate drlnd
    • Windows:
    conda create --name drlnd python=3.6 
    activate drlnd
  2. Follow the instructions in the OpenAI Gym repository to perform a minimal install of OpenAI Gym.

    • Next, install the classic control environment group by following the instructions here.
    • Then, install the box2d environment group by following the instructions here.
  3. Clone this repository (if you haven't already!), and navigate to the python/ folder. Then, install several dependencies.

git clone https://github.com/jpruente92/RL_class_project_1
cd RL_class_project_1/python
pip install .
  4. Use the drlnd environment to start the program.

Required files:

The Unity executable has to be inside this folder; an environment for Windows (64-bit) called "Reacher.exe" is included. If you are not on Windows (64-bit), you can download the environment with one of the following links:

Reacher environment:

In the Reacher environment, 20 double-jointed arms can move to target locations indicated by bubbles. A reward of +0.1 is provided for each step that an agent's hand is in the goal location. The observation space consists of 33 variables corresponding to position, rotation, velocity, and angular velocities of the arms. Each action is a vector with four numbers, corresponding to torque applicable to two joints. Every entry in the action vector should be a number between -1 and 1.
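As an illustration of the interface (assuming the unityagents API that ships with the Udacity DRLND environments; the file name follows the Windows build mentioned above), the following sketch runs one episode with random actions clipped to [-1, 1]:

import numpy as np
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Reacher.exe")   # adjust for your platform's build
brain_name = env.brain_names[0]

env_info = env.reset(train_mode=False)[brain_name]
num_agents = len(env_info.agents)                              # 20 arms
action_size = env.brains[brain_name].vector_action_space_size  # 4 torques per arm
states = env_info.vector_observations                          # shape (20, 33)
scores = np.zeros(num_agents)

while True:
    actions = np.clip(np.random.randn(num_agents, action_size), -1, 1)
    env_info = env.step(actions)[brain_name]
    scores += env_info.rewards     # +0.1 per step with the hand in the goal location
    if np.any(env_info.local_done):
        break

print("Average score over the 20 agents:", scores.mean())
env.close()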

Starting the program:

  • Hyperparameters and settings for the algorithm can be changed in the file "hyperparameters.py" (see the placeholder example after this list).
    • To view a trained agent, set "LOAD" to True, "FILENAME_FOR_LOADING" to the name of the model weight files (without "reacher_20"), and "ENV_TRAIN" to False.
    • To train a new agent, set "LOAD" to False and "ENV_TRAIN" to True; if you want to save the model weights, also set "Save" to True and "FILENAME_FOR_SAVING" to the name for the weight files.
  • After the hyperparameters are set, run the file "main.py".
  • Changes to any other files are not recommended.
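For orientation, the two configurations described above might look roughly as follows in "hyperparameters.py"; the values are placeholders, and the exact variable names and defaults are those defined in the file itself:

# Watching a trained agent:
LOAD = True
FILENAME_FOR_LOADING = "my_weights"   # prefix of the weight files (without "reacher_20")
ENV_TRAIN = False

# Training a new agent and saving the weights:
# LOAD = False
# ENV_TRAIN = True
# SAVE = True                         # referred to as "Save" above; check the file for the exact name
# FILENAME_FOR_SAVING = "my_weights"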
