# Installing Soft Q-Learning with Sawyer Model
Time for installation: ~1 hour
Instructions written by Gerrit Schoettler, 04/12/2018 ([email protected])
Updated 14/05/2018
-
Install Ubuntu 16.04.4 LTS or 18.04 LTS. Use the 64-bit PC (AMD64) version of Ubuntu 16.04 LTS Desktop (http://releases.ubuntu.com/16.04/), even if you have an Intel CPU; the AMD64 image runs on both AMD and Intel processors. The complete installation has also been performed successfully on Ubuntu 18.04 LTS, 64-bit PC (AMD64): http://releases.ubuntu.com/18.04/. It is fine to install Ubuntu as a virtual machine in VMware Workstation Player 14. 30 GB of hard drive space is recommended.
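To double-check the installed release and architecture afterwards (a quick sanity check using standard Ubuntu tools):
lsb_release -a
uname -m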
-
Update all packages:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade
-
Get a MuJoCo license from https://www.roboti.us/license.html. Download the computer-ID tool for Linux (getid_linux), then:
cd Downloads
chmod +x getid_linux
./getid_linux
Register the printed computer ID on the license page; the license key (mjkey.txt) arrives by email.
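rllab installs MuJoCo into its vendor directory via a bundled setup script. A sketch, assuming the script still ships at scripts/setup_mujoco.sh and that rllab has already been cloned (see the Soft Q-Learning step below), with mjkey.txt and the mjpro131 Linux zip from roboti.us downloaded beforehand:
cd ~/projectThesis/rllab
./scripts/setup_mujoco.sh
The script asks for the paths to the MuJoCo zip and the license key file.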
-
Install git (sudo apt install git), pip (sudo apt install python-pip), and Miniconda (Python 3.6, 64-bit, https://conda.io/docs/user-guide/install/linux.html). Close the terminal after installing Miniconda so the PATH changes take effect.
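A typical Miniconda installation, assuming the standard installer URL from the linked guide is still current (accept the license prompts and let the installer add conda to your .bashrc):
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh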
-
Install Sublime Text the official way (http://tipsonubuntu.com/2017/05/30/install-sublime-text-3-ubuntu-16-04-official-way/). Add the Sublime apt repository as described in the link, then:
sudo apt-get update
sudo apt-get install sublime-text
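For reference, the repository setup from the linked guide looks like the following (assuming Sublime's signing key and repository URLs are unchanged):
wget -qO - https://download.sublimetext.com/sublimehq-pub.gpg | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://download.sublimetext.com/ apt/stable/" | sudo tee /etc/apt/sources.list.d/sublime-text.list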
-
Start the installation of Soft Q-Learning: https://github.com/haarnoja/softqlearning. Choose the local installation and follow the instructions there, starting with cloning rllab. Use “<installation_path_of_your_choice>” = “~/projectThesis”. A typical sequence is sketched below.
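A minimal sketch of those steps, assuming the rllab repository at https://github.com/rll/rllab and assuming the softqlearning environment.yml still defines the conda environment named sql:
mkdir -p ~/projectThesis
cd ~/projectThesis
git clone https://github.com/rll/rllab.git
git clone https://github.com/haarnoja/softqlearning.git
cd softqlearning
conda env create -f environment.yml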
After creating the conda environment (ignore the numpy update message), test the example experiments:
cd ~/projectThesis/rllab
export PYTHONPATH=$(pwd):${PYTHONPATH}
cd ..
cd softqlearning
source activate sql
Run the swimmer example for one minute to test the installation (edit log_dir to match your paths):
python ./examples/mujoco_all_sql.py --env=swimmer --log_dir="/home/gerrit/projectThesis/data/swimmer-experiment"
If the reinforcement learning does not start on the first attempt, execute the command a second time.
To visualize the swimmer example (edit the path to match your log_dir):
python ./scripts/sim_policy.py --max-path-length 1000 --speedup 100 /home/gerrit/projectThesis/data/swimmer-experiment/params.pkl
-
Implementation of the Sawyer model:
Download and save all files from rllab/vendor/mujoco_models. These files are named “mesh”, “sawyer_gripper_mocap.xml”, and “sawyer_gripper_mocap_unedited.xml”. Copy “mesh” and “sawyer_gripper_mocap.xml” to /home/gerrit/projectThesis/rllab/vendor/mujoco_models (edit the path if your project path is not /home/gerrit/projectThesis/).
-
Include the Sawyer environment:
Download and save the file “sawyer_test_env_new.py” from rllab/rllab/envs/mujoco and copy it to ~/projectThesis/rllab/rllab/envs/mujoco.
Download and save the files “mujoco_all_sql.py” and “reuse_qf_policy_sawyer.py” from softqlearning/examples and copy them to ~/projectThesis/softqlearning/examples, replacing the old versions. Only the SawyerTestEnv was added; no other changes were made. (Edit the paths if your project path is not /home/gerrit/projectThesis/.) A sketch of the copy commands follows below.
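A minimal sketch of the copy commands, assuming the downloaded files are in ~/Downloads and the project path is ~/projectThesis:
cp -r ~/Downloads/mesh ~/Downloads/sawyer_gripper_mocap.xml ~/projectThesis/rllab/vendor/mujoco_models/
cp ~/Downloads/sawyer_test_env_new.py ~/projectThesis/rllab/rllab/envs/mujoco/
cp ~/Downloads/mujoco_all_sql.py ~/Downloads/reuse_qf_policy_sawyer.py ~/projectThesis/softqlearning/examples/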
-
Run Sawyer experiment:
cd ~/projectThesis/rllab
export PYTHONPATH=$(pwd):${PYTHONPATH}
cd ..
cd softqlearning
source activate sql
python ./examples/mujoco_all_sql.py --env=sawyer --log_dir="/home/gerrit/projectThesis/data/sawyer-experiment"
(Let the experiment run for a few iterations before stopping it.)
Visualize the Sawyer experiment (edit the path if the project path is not /home/gerrit/projectThesis/):
python ./scripts/sim_policy.py --max-path-length 1000 --speedup 100 /home/gerrit/projectThesis/data/sawyer-experiment/itr_0.pkl
Reuse existing policy:
python ./examples/reuse_qf_policy_sawyer.py /home/gerrit/projectThesis/data/sawyer-experiment/itr_0.pkl
The resulting files are saved in the data folder of softqlearning. From there, copy them over to your saved files and rename them to preserve the order of iterations. This way, training can continue after an interruption. An illustrative example follows below.
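An illustrative copy-and-rename, with hypothetical file locations (the actual folder names depend on your runs; the point is to keep the itr_<n>.pkl numbering consecutive):
cp ~/projectThesis/softqlearning/data/sawyer-experiment/itr_0.pkl ~/projectThesis/data/sawyer-experiment/itr_1.pkl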
To record an mp4 video:
python ./record_video.py
The video is saved to /home/gerrit/projectThesis/rllab/data/video/sawyer.
- This update included:
- Big changes in the Sawyer model -> disc-on-peg task with a square disc on a round peg
- Changes in the Sawyer environment -> changes in the reward function (1-norm distance and a reward for correct orientation)
- The next update will include:
- Inclusion of the SAC algorithm (https://github.com/haarnoja/sac) -> this was difficult to install because it required changes in SAC's environment.yml file before building the conda environment. Contact Gerrit Schoettler for a preview of this update.