Merge pull request #53 from epignatelli/themes
Implementing MiniGrid skin for RGB obs
epignatelli authored Mar 8, 2024
2 parents 4b1d721 + 3868f4a commit b05a291
Showing 70 changed files with 515 additions and 661 deletions.
119 changes: 90 additions & 29 deletions README.md
@@ -8,72 +8,133 @@
**[Quickstart](#what-is-navix)** | **[Installation](#installation)** | **[Examples](#examples)** | **[Cite](#cite)**

## What is NAVIX?
NAVIX is [minigrid](https://github.com/Farama-Foundation/Minigrid) in JAX, **>1000x** faster with Autograd and XLA support.
You can see a superficial performance comparison [here](docs/performance.ipynb).
NAVIX is a JAX-powered reimplementation of [minigrid](https://github.com/Farama-Foundation/Minigrid). Key features:
- Performance Boost: NAVIX offers a **~>1000x** speed increase compared to the original Minigrid, enabling faster experimentation and scaling. You can see a preliminary performance comparison [here](docs/performance.py).
- XLA Compilation: Leverage the power of XLA to optimize NAVIX computations for your hardware (CPU, GPU, TPU).
- Autograd Support: Differentiate through environment transitions, opening up new possibilities such as learned world models.

The library is in active development, and we are working on adding more environments and features.
If you want to join the development and contribute, please [open a discussion](https://github.com/epignatelli/navix/discussions/new?category=general) and let's have a chat!


## Installation
We currently support the OSs supported by JAX.
You can find a description [here](https://github.com/google/jax#installation).
#### Install JAX
Follow the official installation guide for your OS and preferred accelerator: https://github.com/google/jax#installation.

You might want to follow the same guide to install JAX for your favourite accelerator
(e.g. [CPU](https://github.com/google/jax#pip-installation-cpu),
[GPU](https://github.com/google/jax#pip-installation-gpu-cuda-installed-locally-harder), or
[TPU](https://github.com/google/jax#pip-installation-colab-tpu)
).

- ### Stable
Then, install the stable version of `navix` and its dependencies with:
#### Install NAVIX
```bash
pip install navix
```

- ### Nightly
Or, if you prefer to install the latest version from source:
Or, for the latest version from source:
```bash
pip install git+https://github.com/epignatelli/navix
```

## Examples

### XLA compilation
One straightforward use case is to accelerate the computation of the environment with XLA compilation.
For example, here we vectorise the environment to run multiple environments in parallel, and compile **the full training run**.

You can find a partial performance comparison with [minigrid](https://github.com/Farama-Foundation/Minigrid) in the [docs](docs/profiling.ipynb).

### Compiling a collection step
```python
import jax
import navix as nx
import jax.numpy as jnp


def run(seed)
    env = nx.environments.Room(16, 16, 8)
def run(seed):
    env = nx.make('MiniGrid-Empty-8x8-v0') # Create the environment
    key = jax.random.PRNGKey(seed)
    timestep = env.reset(key)
    actions = jax.random.randint(key, (N_TIMESTEPS,), 0, 6)
    actions = jax.random.randint(key, (N_TIMESTEPS,), 0, env.action_space.n)

    def body_fun(timestep, action):
        timestep = env.step(timestep, jnp.asarray(action))
        timestep = env.step(action) # Update the environment state
        return timestep, ()

    return jax.lax.scan(body_fun, timestep, jnp.asarray(actions, dtype=jnp.int32))[0]
    return jax.lax.scan(body_fun, timestep, actions)[0]

final_timestep = jax.jit(jax.vmap(run))(jax.numpy.arange(1000))
# Compile the entire training run for maximum performance
final_timestep = jax.jit(jax.vmap(run))(jnp.arange(1000))
```
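
The scan, vmap and jit pattern above can be tried without NAVIX installed. What follows is a minimal sketch, not NAVIX's API: `toy_reset`, `toy_step`, `GRID_SIZE` and the value of `N_TIMESTEPS` are hypothetical stand-ins for a 1-D corridor, chosen only to show how the three transformations compose.

```python
import jax
import jax.numpy as jnp

N_TIMESTEPS = 100  # assumed rollout length (illustrative value)
GRID_SIZE = 8      # assumed width of the toy 1-D corridor


def toy_reset(key):
    """Hypothetical stand-in for env.reset: sample a start cell."""
    return jax.random.randint(key, (), 0, GRID_SIZE)


def toy_step(position, action):
    """Hypothetical stand-in for env.step: action 0 moves left, 1 moves right."""
    delta = jnp.where(action == 0, -1, 1)
    return jnp.clip(position + delta, 0, GRID_SIZE - 1)


def run(seed):
    # Use separate keys for the reset and for sampling actions
    reset_key, action_key = jax.random.split(jax.random.PRNGKey(seed))
    position = toy_reset(reset_key)
    actions = jax.random.randint(action_key, (N_TIMESTEPS,), 0, 2)

    def body_fun(position, action):
        return toy_step(position, action), ()

    # scan unrolls the rollout inside a single compiled loop
    final_position, _ = jax.lax.scan(body_fun, position, actions)
    return final_position


# vmap batches over seeds; jit compiles the batched rollout as one program
final_positions = jax.jit(jax.vmap(run))(jnp.arange(16))
```

Splitting the key keeps the start cell and the action sequence from being drawn from the same random stream.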

### Compiling a full training run
```python
import jax
import navix as nx
import jax.numpy as jnp
from jax import random

def run_episode(seed, env, policy):
    """Simulates a single episode with a given policy"""
    key = random.PRNGKey(seed)
    timestep = env.reset(key)
    done = False
    total_reward = 0

    while not done:
        action = policy(timestep.observation)
        timestep, reward, done, _ = env.step(action)
        total_reward += reward

    return total_reward

def train_policy(policy, num_episodes):
    """Trains a policy over multiple parallel episodes"""
    envs = jax.vmap(nx.make, in_axes=0)(['MiniGrid-MultiRoom-N2-S4-v0'] * num_episodes)
    seeds = random.split(random.PRNGKey(0), num_episodes)

    # Compile the entire training loop with XLA
    compiled_episode = jax.jit(run_episode)
    compiled_train = jax.jit(jax.vmap(compiled_episode, in_axes=(0, 0, None)))

    for _ in range(num_episodes):
        rewards = compiled_train(seeds, envs, policy)
        # ... Update the policy based on rewards ...

# Hypothetical policy function
def policy(observation):
    # ... your policy logic ...
    return action

# Start the training
train_policy(policy, num_episodes=100)
```
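
A Python `while` loop whose condition depends on a traced value generally cannot be compiled by `jax.jit`; `jax.lax.while_loop` is the usual substitute. The sketch below illustrates that substitution under stated assumptions: `toy_policy`, `toy_step`, `GOAL` and `MAX_STEPS` are hypothetical and describe a 1-D corridor, not a NAVIX environment.

```python
import jax
import jax.numpy as jnp

GOAL = 7        # assumed goal cell of a toy 1-D corridor
MAX_STEPS = 50  # assumed step budget so every episode terminates


def toy_policy(position):
    """Hypothetical policy: always move right (action 1)."""
    return jnp.asarray(1, dtype=jnp.int32)


def toy_step(position, action):
    """Hypothetical transition: reward 1.0 when the goal cell is reached."""
    delta = jnp.where(action == 0, -1, 1)
    new_position = jnp.clip(position + delta, 0, GOAL)
    reward = jnp.where(new_position == GOAL, 1.0, 0.0)
    return new_position, reward


def run_episode(start):
    def cond_fun(carry):
        position, _, t = carry
        return (position != GOAL) & (t < MAX_STEPS)

    def body_fun(carry):
        position, total_reward, t = carry
        position, reward = toy_step(position, toy_policy(position))
        return position, total_reward + reward, t + 1

    init = (start, jnp.zeros((), jnp.float32), jnp.zeros((), jnp.int32))
    _, total_reward, steps = jax.lax.while_loop(cond_fun, body_fun, init)
    return total_reward, steps


# One episode per start cell, batched with vmap and compiled with jit
starts = jnp.arange(4, dtype=jnp.int32)
total_rewards, episode_lengths = jax.jit(jax.vmap(run_episode))(starts)
```

Under `vmap`, the loop keeps iterating until every episode in the batch has finished, so the step budget bounds the cost of the whole batch.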

### Backpropagation through the environment
```python
import jax
import navix as nx
import jax.numpy as jnp
from jax import grad
from flax import struct


class Model(struct.PyTreeNode):
    @nn.compact
    def __call__(self, x):
        # ... your NN here

Another use case is to backpropagate through the environment transition function, for example to learn a world model.
model = Model()
env = nx.environments.Room(16, 16, 8)

def loss(params, timestep):
    action = jnp.asarray(0)
    pred_obs = model.apply(timestep.observation)
    timestep = env.step(timestep, action)
    return jnp.square(timestep.observation - pred_obs).mean()

key = jax.random.PRNGKey(0)
timestep = env.reset(key)
params = model.init(key, timestep.observation)

gradients = grad(loss)(params, timestep)
```
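
For a gradient to pass through a transition, the quantity being differentiated has to feed into the transition's inputs. The sketch below shows that mechanism on a deliberately simple case: `toy_transition` and the linear controller are hypothetical, continuous stand-ins, not NAVIX functions, and whether such gradients are informative for a discrete grid world depends on how its observations are computed.

```python
import jax
import jax.numpy as jnp


def toy_transition(state, action):
    """Hypothetical differentiable dynamics: next_state = 0.9 * state + action."""
    return 0.9 * state + action


def loss(gain, state, target):
    # Hypothetical linear controller: the action depends on the parameter,
    # so the gradient has to flow through toy_transition to reach `gain`.
    action = gain * state
    next_state = toy_transition(state, action)
    return jnp.mean(jnp.square(next_state - target))


state = jnp.array([1.0, 2.0])
target = jnp.zeros(2)
gain = jnp.array(0.0)

# d(loss)/d(gain), computed by differentiating through the transition
gradient = jax.grad(loss)(gain, state, target)
```

Here the loss reduces to `(0.9 + gain)**2 * mean(state**2)`, which gives a closed form to check the gradient against.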

TODO(epignatelli): add example.
## Join Us!

NAVIX is actively developed. If you'd like to contribute to this open-source project, we welcome your involvement! Start a discussion or open a pull request.

## Cite
If you use `navix` please consider citing it as:
If you use `navix` please cite it as:

```bibtex
@misc{pignatelli2023navix,
4 changes: 4 additions & 0 deletions assets/COPYRIGHT
@@ -0,0 +1,4 @@
Copyright 2024 https://github.com/Farama-Foundation/Minigrid
The following images are under Apache 2.0 License as per https://github.com/Farama-Foundation/Minigrid/LICENSE.
A copy of the license is provided in the file assets/LICENSE.

40 changes: 40 additions & 0 deletions assets/LICENSE
@@ -0,0 +1,40 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
@@ -154,49 +194,23 @@
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS
Binary file added assets/sprites/ball_blue.png
Binary file added assets/sprites/ball_green.png
Binary file added assets/sprites/ball_grey.png
Binary file added assets/sprites/ball_purple.png
Binary file added assets/sprites/ball_red.png
Binary file added assets/sprites/ball_yellow.png
Binary file added assets/sprites/box_blue.png
Binary file added assets/sprites/box_green.png
Binary file added assets/sprites/box_grey.png
Binary file added assets/sprites/box_purple.png
Binary file added assets/sprites/box_red.png
Binary file added assets/sprites/box_yellow.png
Binary file added assets/sprites/door_closed_blue.png
Binary file added assets/sprites/door_closed_green.png
Binary file added assets/sprites/door_closed_grey.png
Binary file added assets/sprites/door_closed_purple.png
Binary file added assets/sprites/door_closed_red.png
Binary file added assets/sprites/door_closed_yellow.png
Binary file added assets/sprites/door_locked_blue.png
Binary file added assets/sprites/door_locked_green.png
Binary file added assets/sprites/door_locked_grey.png
Binary file added assets/sprites/door_locked_purple.png
Binary file added assets/sprites/door_locked_red.png
Binary file added assets/sprites/door_locked_yellow.png
Binary file added assets/sprites/door_open_blue.png
Binary file added assets/sprites/door_open_green.png
Binary file added assets/sprites/door_open_grey.png
Binary file added assets/sprites/door_open_purple.png
Binary file added assets/sprites/door_open_red.png
Binary file added assets/sprites/door_open_yellow.png
Binary file added assets/sprites/floor.png
Binary file added assets/sprites/goal.png
Binary file added assets/sprites/key_blue.png
Binary file added assets/sprites/key_green.png
Binary file added assets/sprites/key_grey.png
Binary file added assets/sprites/key_purple.png
Binary file added assets/sprites/key_red.png
Binary file added assets/sprites/key_yellow.png
Binary file added assets/sprites/lava.png
Binary file added assets/sprites/player_east.png
Binary file added assets/sprites/player_north.png
Binary file added assets/sprites/player_south.png
Binary file added assets/sprites/player_west.png
Binary file added assets/sprites/wall.png
138 changes: 0 additions & 138 deletions docs/performance.ipynb

This file was deleted.
