PyTorch implementation of "Safe Exploration in Continuous Action Spaces" [Dalal et al.]

U70-TK/safe-explorer

 
 


Safe-Explorer

Introduction

This repository contains a PyTorch implementation of the paper "Safe Exploration in Continuous Action Spaces" [Dalal et al.], along with "Continuous Control With Deep Reinforcement Learning" [Lillicrap et al.]. Dalal et al. present a closed-form, analytically optimal solution for ensuring safety in continuous action spaces. The proposed "safety layer" makes the smallest possible perturbation to the original action such that the safety constraints are satisfied.

[Figure: safety layer]
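As a rough illustration of the closed-form correction, the sketch below applies the rule from Dalal et al. to constraints linearized as c_i + g_i · a <= C_i: compute a multiplier for each constraint, pick the largest, and shift the action along that constraint's gradient. It is a minimal standalone sketch in plain Python, not the code in this repository. The function name `safe_action` and its argument names are hypothetical, and it relies on the paper's assumption that at most one constraint is active at a time.

```python
from typing import List, Sequence


def safe_action(
    action: Sequence[float],
    c: Sequence[float],
    g: Sequence[Sequence[float]],
    c_max: Sequence[float],
) -> List[float]:
    """Correct `action` so that c_i + g_i . a <= c_max_i holds.

    Hypothetical helper for illustration; assumes at most one
    constraint is violated at a time (as in Dalal et al.).
    """
    best_lambda, best_g = 0.0, None
    for c_i, g_i, limit in zip(c, g, c_max):
        g_norm_sq = sum(x * x for x in g_i)
        if g_norm_sq == 0.0:
            continue  # the action has no influence on this constraint
        # Predicted constraint value if the uncorrected action is taken.
        predicted = c_i + sum(x * a for x, a in zip(g_i, action))
        # Multiplier: positive only when the constraint would be violated.
        lam = max(0.0, (predicted - limit) / g_norm_sq)
        if lam > best_lambda:
            best_lambda, best_g = lam, g_i
    if best_g is None:
        return list(action)  # already safe, no perturbation needed
    # Smallest shift along the gradient of the most violated constraint.
    return [a - best_lambda * x for a, x in zip(action, best_g)]
```

For example, with a single constraint `c=[0.5]`, `g=[[1.0, 0.0]]`, `c_max=[1.0]`, the action `[1.0, 0.0]` is pulled back to the constraint boundary, while an action that already satisfies the constraint is returned unchanged.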

Dalal et al. also propose two new domains, BallND and Spaceship, which are governed by first- and second-order dynamics respectively. In the Spaceship domain the agent receives a reward only on task completion, while BallND has a continuous reward based on the distance from the target. The implementations of both tasks extend OpenAI Gym's environment interface (gym.Env).
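The difference between first- and second-order dynamics can be sketched with two update rules: in a first-order system the action sets the velocity directly, while in a second-order system the action sets the acceleration and the velocity is carried as state. The functions below are hypothetical and only illustrate this distinction with a simple Euler step. The names, the `dt` default, and the integration scheme are assumptions, not the repository's environment code.

```python
from typing import List, Sequence, Tuple


def first_order_step(
    position: Sequence[float], action: Sequence[float], dt: float = 0.1
) -> List[float]:
    """First-order dynamics: the action is the velocity."""
    return [p + dt * a for p, a in zip(position, action)]


def second_order_step(
    position: Sequence[float],
    velocity: Sequence[float],
    action: Sequence[float],
    dt: float = 0.1,
) -> Tuple[List[float], List[float]]:
    """Second-order dynamics: the action is the acceleration."""
    # Update the velocity first, then move using the new velocity.
    new_velocity = [v + dt * a for v, a in zip(velocity, action)]
    new_position = [p + dt * v for p, v in zip(position, new_velocity)]
    return new_position, new_velocity
```

In the second-order case the effect of an action persists through the velocity, so an unsafe state can be reached several steps after the action that caused it.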

Setup

The code requires Python 3.6+ and is tested with torch 1.1.0. To install the dependencies, run,

pip install -r requirements.txt

Training

To obtain the list of parameters and their default values, run,

python -m safe_explorer.main --help

Train the model by simply running,

BallND

python -m safe_explorer.main --main_trainer_task ballnd

Spaceship

python -m safe_explorer.main --main_trainer_task spaceship

Monitor training with TensorBoard,

tensorboard --logdir=runs

Results

To be updated.

Acknowledgement

Some modifications to the DDPG implementation are based on the OpenAI Spinning Up implementation.

References

  • Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015).

  • Dalal, Gal, et al. "Safe exploration in continuous action spaces." arXiv preprint arXiv:1801.08757 (2018).
