Chapter 16, Robot Learning in Simulation, of the book Deep Reinforcement Learning: an example of a Sawyer robot learning to reach a target with the parallel Soft Actor-Critic (SAC) algorithm, using PyRep for Sawyer robot simulation and game building. The environment is wrapped in the OpenAI Gym format.

wangjunyi9999/Chapter16-Robot-Learning-in-Simulation


Chapter 16: Robot Learning in Simulation (Project 4)

Description:

An example of a Sawyer robot learning to reach a target with the parallel Soft Actor-Critic (SAC) algorithm, using PyRep for Sawyer robot simulation and game building. The environment is wrapped in the OpenAI Gym format.
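
For readers unfamiliar with the Gym format, the following is a minimal sketch of the classic `reset()` / `step()` interaction loop. `ToyReachEnv` and `run_episode` are hypothetical names used only for illustration: this is a 2-D stand-in with an assumed negative-distance reward, not the repository's Sawyer environment, and it does not call PyRep or V-REP.

```python
import math
import random


class ToyReachEnv:
    """Hypothetical stand-in for a Gym-style reaching task (not the repository's class)."""

    def __init__(self, max_steps=50, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.target = (0.5, 0.5)  # assumed fixed target position

    def reset(self):
        # Start each episode from a random 2-D "end-effector" position.
        self.pos = [self.rng.uniform(-1.0, 1.0), self.rng.uniform(-1.0, 1.0)]
        self.steps = 0
        return self._obs()

    def step(self, action):
        # Apply a bounded displacement per axis, then score by negative distance.
        for i in range(2):
            self.pos[i] += max(-0.1, min(0.1, action[i]))
        self.steps += 1
        dist = math.dist(self.pos, self.target)
        reward = -dist  # assumed reward shape, for illustration only
        done = dist < 0.05 or self.steps >= self.max_steps
        return self._obs(), reward, done, {}

    def _obs(self):
        # Observation: current position followed by target position.
        return tuple(self.pos) + self.target


def run_episode(env, policy):
    """Roll out one episode with the classic Gym loop and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

A policy is any callable that maps an observation to an action, for example `run_episode(ToyReachEnv(), lambda o: (o[2] - o[0], o[3] - o[1]))`, which moves straight toward the target. A learning algorithm such as SAC only needs this interface, which is why wrapping the simulator in the Gym format is convenient.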

Dependencies:

Note:

  • V-REP was renamed CoppeliaSim from version 4.0.0 onward, and these later versions may be incompatible with PyRep for this project, so we suggest using V-REP 3.6.2 together with the maintained PyRep in our repository.
  • PyRep has an official repository, but we maintain a stable version in our repository that supports V-REP 3.6.2. Please use the version we provide to avoid unnecessary incompatibilities.

Contents:

  • arms/: object models of arms;
  • hands/: object models of grippers;
  • objects/: models of other objects in the scene;
  • scenes/: built scenes for Sawyer robot grasping;
  • figures/: figures for displaying;
  • model/: the model after training, and two pre-trained models with different reward functions;
  • data/: reward logs for different reward functions;
  • sawyer_grasp_env_boundingbox.py: script of Sawyer robot grasping environment;
  • sac_learn.py: parallel Soft Actor-Critic algorithm for solving the Sawyer robot grasping task;
  • reward_log.npy: log of episode reward during training;
  • plot.ipynb: displaying the learning curves.
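
As background for sac_learn.py, the sketch below shows the standard soft Bellman target that Soft Actor-Critic uses to train its Q-networks, written for a single transition. It illustrates the general algorithm, not code taken from the repository: `soft_q_target` is a hypothetical helper, and the default `gamma` and `alpha` values are illustrative assumptions rather than the hyperparameters used here. The parallel sampling part is not shown.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target for one transition in SAC.

    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))
    """
    # Clipped double-Q estimate plus the entropy bonus (-alpha * log-probability).
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    # No bootstrapping from the next state when the episode has terminated.
    return reward + gamma * (1.0 - float(done)) * soft_value
```

Taking the minimum of the two target Q-values reduces overestimation, and the `-alpha * next_log_prob` term rewards the policy for staying stochastic, which is what distinguishes SAC from a plain actor-critic update.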

Usage:

  1. First check that the environment runs successfully:

    $ python sawyer_grasp_env_boundingbox.py

    If V-REP launches a scene in which the Sawyer robot arm moves randomly, go to the next step; otherwise, check that the dependencies are installed with the required packages and versions.

  2. Run $ python sac_learn.py --train to train the policy.

  3. Run $ python sac_learn.py --test to test the trained policy. Remember to change trained_model_path, which defaults to the trained model we provide.

  4. Training writes a reward_log.npy file that records the episode reward. To display it, run $ jupyter notebook in a new terminal, open plot.ipynb, and press Shift+Enter to run the first cell, which plots the learning curve.
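
The reward log can also be inspected outside the notebook. The following is a minimal sketch, assuming reward_log.npy holds a one-dimensional array with one reward per episode. `moving_average` and `plot_reward_log` are hypothetical helpers, not functions from plot.ipynb, and the window size is an arbitrary choice.

```python
def moving_average(values, window=10):
    """Trailing moving average; early entries average over the samples seen so far."""
    smoothed, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the sample that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def plot_reward_log(path="reward_log.npy", window=10):
    """Load the reward log and plot raw and smoothed episode rewards."""
    # Imported lazily so moving_average stays usable without these packages.
    import numpy as np
    import matplotlib.pyplot as plt

    rewards = np.load(path).tolist()  # assumes a 1-D array of episode rewards
    plt.plot(rewards, alpha=0.3, label="episode reward")
    plt.plot(moving_average(rewards, window), label=f"moving average ({window})")
    plt.xlabel("episode")
    plt.ylabel("reward")
    plt.legend()
    plt.show()
```

Calling `plot_reward_log()` from the repository root after training should show the raw curve with a smoothed overlay. Smoothing helps because per-episode rewards in reinforcement learning are usually noisy.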

Authors:

Zihan Ding, Yanhua Huang

Citing:

@misc{DeepReinforcementLearning-Chapter16-RobotLearninginSimulation,
  author = {Zihan Ding and Yanhua Huang},
  title = {Chapter16-RobotLearninginSimulation},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/deep-reinforcement-learning-book/Chapter16-Robot-Learning-in-Simulation}},
}

or

@book{deepRL-2020,
 title={Deep Reinforcement Learning: Fundamentals, Research, and Applications},
 editor={Hao Dong and Zihan Ding and Shanghang Zhang},
 author={Hao Dong and Zihan Ding and Shanghang Zhang and Hang Yuan and Hongming Zhang and Jingqing Zhang and Yanhua Huang and Tianyang Yu and Huaqing Zhang and Ruitong Huang},
 publisher={Springer Nature},
 note={\url{http://www.deepreinforcementlearningbook.org}},
 year={2020}
}