Dreaming in goal-conditioned environments

This code builds on the model-based reinforcement learning agent Dreamer:

@inproceedings{Dreamer,
  author    = {Danijar Hafner and
               Timothy P. Lillicrap and
               Jimmy Ba and
               Mohammad Norouzi},
  title     = {Dream to Control: Learning Behaviors by Latent Imagination},
  booktitle = {8th International Conference on Learning Representations, {ICLR} 2020,
               Addis Ababa, Ethiopia, April 26-30, 2020},
  year      = {2020}
}

Specifically, it extends the implementation by first author Danijar Hafner so that it also works on goal-conditioned robotics environments from OpenAI Gym, such as FetchReach-v1. Read more about the goal-conditioned environment suite in this OpenAI blog post.
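Goal-conditioned Gym robotics environments return a dictionary observation with the keys observation, achieved_goal and desired_goal instead of a flat vector. The following is a minimal sketch of how such a dictionary could be flattened and how a sparse reward could be derived from the two goals. It is not code from this repository: the helper names, the example values and the 0.05 distance threshold are assumptions made for illustration.

```python
import math


def flatten_goal_obs(obs):
    """Concatenate the state and the desired goal into one flat list (hypothetical helper)."""
    return list(obs["observation"]) + list(obs["desired_goal"])


def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
    """Return 0.0 if the achieved goal is within `threshold` of the desired goal, else -1.0.

    The threshold value is an assumption for this sketch.
    """
    distance = math.sqrt(
        sum((a - d) ** 2 for a, d in zip(achieved_goal, desired_goal))
    )
    return 0.0 if distance <= threshold else -1.0


# Made-up values, shaped like a goal-conditioned observation.
example_obs = {
    "observation": [1.34, 0.75, 0.53],
    "achieved_goal": [1.34, 0.75, 0.53],
    "desired_goal": [1.30, 0.80, 0.60],
}

flat = flatten_goal_obs(example_obs)
reward = sparse_reward(example_obs["achieved_goal"], example_obs["desired_goal"])
```

Here the gripper is roughly 0.09 away from the goal, so the sparse reward stays at -1.0 until the distance drops below the threshold.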

Instructions

Create the conda environment with all dependencies:

conda env create --file conda-env.yml
conda activate dreamer-env

This already installs the requirements listed in requirements.txt. Make sure MuJoCo is set up on your machine beforehand (typically in /home/yourname/.mujoco/mujoco200/); conda does not do this for you. Besides the steps in the MuJoCo documentation, I also had to run the following commands inside the conda env:

export LD_LIBRARY_PATH=$HOME/.mujoco/mujoco200/bin:$LD_LIBRARY_PATH
export MUJOCO_PY_MJPRO_PATH=$HOME/.mujoco/mujoco200/
export MUJOCO_PY_MJKEY_PATH=$HOME/.mujoco/mjkey.txt
sudo apt install libosmesa6-dev

Maybe libosmesa6-dev could be included inside conda-env.yml, but I was not able to find a suitable channel for it.
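Before training, it can help to confirm that the MuJoCo paths from the exports above actually exist. The script below is a small sketch and not part of this repository; the function name check_mujoco_setup is hypothetical, and the fallback locations simply mirror the paths used in the export commands.

```python
import os


def check_mujoco_setup(environ=None):
    """Map each MuJoCo variable name to a (path, exists) tuple (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME", os.path.expanduser("~"))
    default_root = os.path.join(home, ".mujoco")
    # Fall back to the locations used in the export commands when a variable is unset.
    paths = {
        "MUJOCO_PY_MJPRO_PATH": environ.get(
            "MUJOCO_PY_MJPRO_PATH", os.path.join(default_root, "mujoco200")
        ),
        "MUJOCO_PY_MJKEY_PATH": environ.get(
            "MUJOCO_PY_MJKEY_PATH", os.path.join(default_root, "mjkey.txt")
        ),
    }
    return {name: (path, os.path.exists(path)) for name, path in paths.items()}


if __name__ == "__main__":
    for name, (path, found) in check_mujoco_setup().items():
        print(f"{name}: {path} -> {'found' if found else 'MISSING'}")
```

Running it inside the activated conda env prints one line per variable, which makes a missing license key or install directory easy to spot.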

Train the agent using the specified logdir and robotics environment:

python3 dreamer.py --logdir ./logdir/fetch-reach-v1/dreamer/1 --task robotics_FetchReach-v1
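The --task value in the command above combines a suite prefix (robotics) with the Gym environment id (FetchReach-v1). A sketch of how such a string could be split at the first underscore is shown below. The function name split_task is hypothetical, and this is an illustration of the naming convention, not necessarily how dreamer.py parses the flag.

```python
def split_task(task):
    """Split '<suite>_<env id>' at the first underscore into (suite, env_id)."""
    suite, sep, env_id = task.partition("_")
    if not sep or not suite or not env_id:
        raise ValueError(f"expected '<suite>_<env id>', got {task!r}")
    return suite, env_id


suite, env_id = split_task("robotics_FetchReach-v1")
print(suite, env_id)  # prints: robotics FetchReach-v1
```

Splitting only at the first underscore keeps environment ids that themselves contain underscores intact.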

Generate plots:

python3 plotting.py --indir ./logdir --outdir ./plots --xaxis step --yaxis test/return --bins 3e4
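To make the --xaxis, --yaxis and --bins flags more concrete, here is a minimal sketch of binning logged metrics. It assumes that the metrics are stored as JSON Lines records containing the keys step and test/return, and that --bins is the bin width along the x-axis. Both are assumptions for this example, and plotting.py may aggregate differently.

```python
import json
from collections import defaultdict


def bin_metric(lines, xaxis="step", yaxis="test/return", bins=3e4):
    """Average `yaxis` over consecutive `xaxis` intervals of width `bins`."""
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip records that do not contain both requested keys.
        if xaxis in record and yaxis in record:
            grouped[int(record[xaxis] // bins)].append(record[yaxis])
    # Report each bin by its left edge together with the mean value.
    return [
        (index * bins, sum(values) / len(values))
        for index, values in sorted(grouped.items())
    ]


# Made-up records in JSON Lines form.
sample = [
    '{"step": 1000, "test/return": -50.0}',
    '{"step": 20000, "test/return": -40.0}',
    '{"step": 35000, "test/return": -10.0}',
    '{"step": 36000, "train/return": -12.0}',
]
binned = bin_metric(sample)
print(binned)  # prints: [(0.0, -45.0), (30000.0, -10.0)]
```

With a bin width of 3e4, the first two records fall into the same bin and are averaged, while the record without a test/return key is ignored.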

Start TensorBoard to view graphs and GIFs:

tensorboard --logdir ./logdir

