SimBa: Simplicity Bias for Scaling Up Parameters in Deep RL

This repository is the official implementation of

SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning by

Hojoon Lee, Dongyoon Hwang, Donghu Kim, Hyunseung Kim, Jun Jet Tai, Kaushik Subramanian,

Peter R. Wurman, Jaegul Choo, Peter Stone, Takuma Seno.

[Website] [Paper]

Overview

TL;DR

Stop worrying about algorithms, just change the network architecture to SimBa.

Method

SimBa is a network architecture designed for RL that avoids overfitting by embedding simplicity bias.
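As a rough illustration, the sketch below follows the three components described in the SimBa paper: observation normalization with running statistics, residual feedforward blocks with pre-layer normalization, and a final layer normalization. It is a dependency-free Python toy, not the JAX code in scale_rl. All names (RunningNorm, residual_block, simba_forward, init_params), the omitted biases and learnable norm parameters, and the random initialization are assumptions made for illustration.

```python
import math
import random


class RunningNorm:
    """Standardizes observations with running mean/variance (Welford's algorithm)."""

    def __init__(self, dim, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations
        self.eps = eps

    def update(self, obs):
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def __call__(self, obs):
        if self.count < 2:  # not enough data to estimate a variance yet
            return list(obs)
        return [(x - m) / math.sqrt(s / self.count + self.eps)
                for x, m, s in zip(obs, self.mean, self.m2)]


def layer_norm(x, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, weight):
    # weight has shape [out_dim][in_dim]; biases are omitted for brevity
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def init_weight(rng, out_dim, in_dim):
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def residual_block(x, w_in, w_out):
    # The MLP sits on the residual branch, so the skip connection
    # keeps a linear path from the block input to its output.
    h = linear(layer_norm(x), w_in)
    h = [max(0.0, v) for v in h]  # ReLU
    h = linear(h, w_out)
    return [a + b for a, b in zip(x, h)]


def init_params(rng, obs_dim, hidden_dim, num_blocks):
    return {
        "embed": init_weight(rng, hidden_dim, obs_dim),
        "blocks": [
            (init_weight(rng, 4 * hidden_dim, hidden_dim),
             init_weight(rng, hidden_dim, 4 * hidden_dim))
            for _ in range(num_blocks)
        ],
    }


def simba_forward(obs, norm, params):
    x = linear(norm(obs), params["embed"])
    for w_in, w_out in params["blocks"]:
        x = residual_block(x, w_in, w_out)
    return layer_norm(x)  # final layer norm before a policy or value head
```

In this toy, scaling up parameters means raising hidden_dim or num_blocks in init_params; the forward pass itself does not change.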

Results

When integrated with Soft Actor-Critic (SAC), SimBa matches the performance of state-of-the-art RL algorithms.

Getting started

Docker

We provide a Dockerfile for easy installation. Build the Docker image and start a container by running

docker build . -t scale_rl
docker run --gpus all -v .:/home/user/scale_rl -it scale_rl /bin/bash

Pip/Conda

If you prefer to install dependencies manually, use either pip or conda. For conda, first install it by following its installation guidelines.

# Use pip
pip install -e .

# Or use conda
conda env create -f deps/environment.yaml

(optional) Jax for GPU

pip install -U "jax[cuda12]==0.4.25" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
# If you want to execute multiple runs on a single GPU, we recommend setting this variable.
export XLA_PYTHON_CLIENT_PREALLOCATE=false

Mujoco

Please see the installation instructions at MuJoCo.

# Additional environment variables for headless rendering
export MUJOCO_GL="egl"
export MUJOCO_EGL_DEVICE_ID="0"
export MKL_SERVICE_FORCE_INTEL="0"
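The same variables can also be set from Python, for example at the top of a launch script. This should happen before mujoco or any rendering library is imported, because such libraries typically read these variables when they are first loaded. A minimal sketch: the helper name configure_headless_rendering is hypothetical, and the values mirror the exports above.

```python
import os


def configure_headless_rendering(device_id="0", env=None):
    """Set the headless-rendering variables unless they are already defined."""
    env = os.environ if env is None else env
    defaults = {
        "MUJOCO_GL": "egl",
        "MUJOCO_EGL_DEVICE_ID": str(device_id),
        "MKL_SERVICE_FORCE_INTEL": "0",
    }
    for key, value in defaults.items():
        # setdefault keeps any value the user exported in the shell
        env.setdefault(key, value)
    return {key: env[key] for key in defaults}


if __name__ == "__main__":
    print(configure_headless_rendering())
```

Using setdefault rather than plain assignment means a value exported in the shell still takes precedence.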

(optional) Humanoid Bench

git clone https://github.com/joonleesky/humanoid-bench
cd humanoid-bench
pip install -e .

(optional) Myosuite

git clone --recursive https://github.com/joonleesky/myosuite
cd myosuite
pip install -e .

Example usage

We provide examples of how to train SAC agents with the SimBa architecture.

To run a single experiment

python run.py

To benchmark the algorithm with all environments

python run_parallel.py \
    --task all \
    --device_ids <list of gpu devices to use> \
    --num_seeds <num_seeds> \
    --num_exp_per_device <number>
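To make the flags concrete, the sketch below shows one way a launcher could spread task and seed combinations across GPUs. It is not the logic of run_parallel.py: the function plan_jobs, the slot-filling order, and the placeholder task names are assumptions for illustration. Each planned job carries an environment that pins one GPU through CUDA_VISIBLE_DEVICES and disables JAX memory preallocation so that several runs can share a device.

```python
import itertools


def plan_jobs(tasks, device_ids, num_seeds, num_exp_per_device):
    """Group every (task, seed) pair into waves of concurrently running jobs.

    Each wave holds at most len(device_ids) * num_exp_per_device jobs.
    """
    # One slot per concurrent experiment, e.g. [0, 0, 1, 1] for two GPUs
    slots = [d for d in device_ids for _ in range(num_exp_per_device)]
    jobs = list(itertools.product(tasks, range(num_seeds)))
    waves = []
    for start in range(0, len(jobs), len(slots)):
        chunk = jobs[start:start + len(slots)]
        wave = []
        for device, (task, seed) in zip(slots, chunk):
            wave.append({
                "task": task,
                "seed": seed,
                "env": {
                    "CUDA_VISIBLE_DEVICES": str(device),
                    "XLA_PYTHON_CLIENT_PREALLOCATE": "false",
                },
            })
        waves.append(wave)
    return waves


if __name__ == "__main__":
    # "task-a" and "task-b" are placeholder names, not real benchmark tasks
    for i, wave in enumerate(plan_jobs(["task-a", "task-b"], [0, 1], 3, 2)):
        print(i, [(j["task"], j["seed"], j["env"]["CUDA_VISIBLE_DEVICES"]) for j in wave])
```

With two tasks, three seeds, two GPUs, and two experiments per device, the six jobs fall into a first wave of four and a second wave of two.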

Scripts

Example scripts to collect DMC, HumanoidBench, and MyoSuite results using SAC with SimBa:

bash scripts/sac_simba_dmc_em.sh
bash scripts/sac_simba_dmc_hard.sh
bash scripts/sac_simba_hbench.sh
bash scripts/sac_simba_myosuite.sh

Analysis

Please refer to analysis/benchmark.ipynb to analyze the experimental results provided in the paper.

Development

Configure development dependencies:

pip install -r deps/dev.requirements.txt
pre-commit install

License

This project is released under the Apache 2.0 license.

Citation

If you find our work useful, please consider citing our paper as follows:

@article{lee2024simba,
  title={SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning}, 
  author={Hojoon Lee and Dongyoon Hwang and Donghu Kim and Hyunseung Kim and Jun Jet Tai and Kaushik Subramanian and Peter R. Wurman and Jaegul Choo and Peter Stone and Takuma Seno},
  journal={arXiv preprint arXiv:2410.09754},
  year={2024}
}
