[Docs] Examples and README updates
Signed-off-by: Matteo Bettini <[email protected]>
matteobettini committed Oct 5, 2023
1 parent 132e1b8 commit 3b03189
Showing 21 changed files with 440 additions and 53 deletions.
168 changes: 120 additions & 48 deletions README.md
@@ -1,3 +1,5 @@
![BenchMARL](https://drive.google.com/uc?export=view&id=15rSPUadQCXfsJq7G2UPor9f-VwzvQ8k7)

# BenchMARL
[![tests](https://github.com/facebookresearch/BenchMARL/actions/workflows/unit_tests.yml/badge.svg)](test)
[![Python](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue.svg)](https://www.python.org/downloads/)
@@ -200,7 +202,8 @@ determine the training strategy. Here is a table with the currently implemented


**Tasks**. Tasks are scenarios from a specific environment which constitute the MARL
challenge to solve.
They differ based on many aspects; here is a table with the current environments in BenchMARL:

| Environment | Tasks | Cooperation | Global state | Reward function |
|-------------|---------------------------------------|---------------------------|--------------|-------------------------------|
@@ -210,6 +213,12 @@
| [MPE](https://github.com/openai/multiagent-particle-envs) | [TBC](benchmarl/conf/task/pettingzoo) | Cooperative + Competitive | Yes | Shared + Independent |
| [SISL](https://github.com/sisl/MADRL) | [TBC](benchmarl/conf/task/pettingzoo) | Cooperative | No | Shared |

> [!NOTE]
> BenchMARL uses the [TorchRL MARL API](https://github.com/pytorch/rl/issues/1463) for grouping agents.
> In competitive environments like MPE, for example, teams will be in different groups. Each group has its own loss,
> models, buffers, and so on. Parameter sharing options refer to sharing within the group. See the example on [creating
> a custom algorithm](examples/extending/custom_algorithm.py) for more info.

**Models**. Models are neural networks used to process data. They can be used as actors (policies) or,
when possible, as critics. We provide a set of base models (layers) and a SequenceModel to concatenate
different layers. All the models can be used with or without parameter sharing within an agent group.
@@ -228,103 +237,166 @@ And the ones that are _work in progress_


## Reporting and plotting

Reporting and plotting are compatible with [marl-eval](https://github.com/instadeepai/marl-eval).
If `experiment.create_json=True` (this is the default in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml))
a file named `{experiment_name}.json` will be created in the experiment output folder with the format of [marl-eval](https://github.com/instadeepai/marl-eval).
You can load and merge these files using the utils in [eval_results](benchmarl/eval_results.py) to create beautiful plots of
your benchmarks.

[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/plotting)

![aggregate_scores](https://drive.google.com/uc?export=view&id=1-f3NolMSjsWppCSXv_DJcs_GUD_fv7vO)
![sample_efficiency](https://drive.google.com/uc?export=view&id=1FK37EfiqD3AQXWlQj7HQCkQDRNe2TuLy)
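
For instance, here is a minimal sketch of merging results from a script, using only the utilities in [eval_results](benchmarl/eval_results.py) (the folder and file paths are hypothetical):

```python
from benchmarl.eval_results import (
    get_raw_dict_from_multirun_folder,
    load_and_merge_json_dicts,
)

# Merge all the marl-eval JSON files found under a hydra multirun folder
raw_dict = get_raw_dict_from_multirun_folder("multirun/2023-10-05/12-00-00")

# Or merge an explicit list of experiment JSON files,
# optionally writing the merged result to disk
merged = load_and_merge_json_dicts(
    ["outputs/exp_a.json", "outputs/exp_b.json"],
    json_output_file="merged.json",
)
```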

## Extending
One of the core tenets of BenchMARL is allowing users to leverage the existing algorithm
and task implementations to benchmark their newly proposed solutions.

For this reason we expose standard interfaces for [algorithms](benchmarl/algorithms/common.py), [tasks](benchmarl/environments/common.py) and [models](benchmarl/models/common.py).
To introduce your solution in the library, you just need to implement the abstract methods
exposed by these base classes, which use objects from the [TorchRL](https://github.com/pytorch/rl) library.

Here is an example of how you can create a custom algorithm [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_algorithm.py).

Here is an example of how you can create a custom task [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_task.py).

Here is an example of how you can create a custom model [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_model.py).
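
To give a flavor of the workflow, here is a hedged sketch of a custom model. The hook names used (`_forward`, `associated_class`) are assumptions inferred from the base classes in [benchmarl/models/common.py](benchmarl/models/common.py); check that file and the examples above for the actual abstract API.

```python
from dataclasses import dataclass

import torch
from tensordict import TensorDictBase

from benchmarl.models.common import Model, ModelConfig


class MyModel(Model):
    # Assumed abstract hook: read this group's input from the tensordict
    # and write the model output back under the expected output key
    def _forward(self, tensordict: TensorDictBase) -> TensorDictBase:
        obs = tensordict.get(self.in_keys[0])
        tensordict.set(self.out_keys[0], torch.zeros_like(obs))  # placeholder computation
        return tensordict


@dataclass
class MyModelConfig(ModelConfig):
    # Assumed hook linking the config dataclass to its model class
    @staticmethod
    def associated_class():
        return MyModel
```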


## Configuring
As highlighted in the [run](#run) section, the project can be configured either
in the script itself or via [hydra](https://hydra.cc/docs/intro/).
We suggest reading the hydra documentation
to get familiar with all its functionalities.

Experiment configurations are in [`benchmarl/conf/config.yaml`](benchmarl/conf/config.yaml),
with the experiment hyperparameters in [`benchmarl/conf/experiment`](benchmarl/conf/experiment).


Running custom experiments is extremely simplified by the [Hydra](https://hydra.cc/) configurations.
The default configuration for the library is contained in the [`benchmarl/conf`](benchmarl/conf) folder.

To run an experiment, you need to select a task and an algorithm
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo
```
You can run a set of experiments, for example, like this
```bash
python benchmarl/run.py --multirun task=vmas/balance algorithm=mappo,maddpg,masac,qmix
```
When running an experiment, you can override its hyperparameters like so
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.lr=0.03 experiment.evaluation=true experiment.train_device="cpu"
```

Experiment hyperparameters are loaded from [`benchmarl/conf/experiment/base_experiment.yaml`](benchmarl/conf/experiment/base_experiment.yaml)
into a dataclass [`ExperimentConfig`](benchmarl/experiment/experiment.py) defining their domain.
This ensures that all and only the expected parameters are loaded, with the right types.
You can also directly load them from a script by calling `ExperimentConfig.get_from_yaml()`.

Here is an example of overriding experiment hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.py).
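
In a script, the same overrides are plain attribute assignments (a minimal sketch, mirroring the hydra overrides above):

```python
from benchmarl.experiment import ExperimentConfig

# Loads from "benchmarl/conf/experiment/base_experiment.yaml"
experiment_config = ExperimentConfig.get_from_yaml()

# Override experiment hyperparameters as dataclass attributes
experiment_config.lr = 0.03
experiment_config.evaluation = True
experiment_config.train_device = "cpu"
```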

### Algorithm

You can override an algorithm configuration when launching BenchMARL.

```bash
python benchmarl/run.py task=vmas/balance algorithm=masac algorithm.num_qvalue_nets=3 algorithm.target_entropy=auto algorithm.share_param_critic=true
```

We suggest not modifying the algorithm configs when running your benchmarks, in order to guarantee
reproducibility.

Available algorithms and their default configs can be found at [`benchmarl/conf/algorithm`](benchmarl/conf/algorithm).
They are loaded into a dataclass [`AlgorithmConfig`](benchmarl/algorithms/common.py), present for each algorithm, defining their domain.
This ensures that all and only the expected parameters are loaded, with the right types.
You can also directly load them from a script by calling `YourAlgorithmConfig.get_from_yaml()`.

Here is an example of overriding algorithm hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.py).
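
For instance, a minimal sketch of the script route (mirroring the hydra overrides above):

```python
from benchmarl.algorithms import MasacConfig

# Loads from "benchmarl/conf/algorithm/masac.yaml"
algorithm_config = MasacConfig.get_from_yaml()

# Override algorithm hyperparameters as dataclass attributes
algorithm_config.num_qvalue_nets = 3  # Use an ensemble of 3 Q-value nets
algorithm_config.target_entropy = "auto"
algorithm_config.share_param_critic = True
```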

### Task

You can override a task configuration when launching BenchMARL.
However, this is not recommended for benchmarking, as tasks should have a fixed version and parameters for reproducibility.

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo task.n_agents=4
```

Available tasks and their default configs can be found at [`benchmarl/conf/task`](benchmarl/conf/task).
They are loaded into a dataclass [`TaskConfig`](benchmarl/environments/common.py), defining their domain.
Tasks are enumerations under the environment name. For example, `VmasTask.NAVIGATION` represents the navigation task in the
VMAS simulator. This allows autocompletion and seeing all available tasks at once.
You can also directly load them from a script by calling `YourEnvTask.TASK_NAME.get_from_yaml()`.

Here is an example of overriding task hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.py).
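
From a script, a minimal sketch looks like this (assuming task overrides live in the task's `config` dict, as in the linked example):

```python
from benchmarl.environments import VmasTask

# Loads from "benchmarl/conf/task/vmas/balance.yaml"
task = VmasTask.BALANCE.get_from_yaml()

# Override task parameters through the loaded config dict
task.config["n_agents"] = 4
```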

### Model

You can override the model configuration when launching BenchMARL.
By default an MLP model will be loaded with the default config.

You can change it like so:

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo model=layers/mlp model.layer_class="torch.nn.Linear" "model.num_cells=[32,32]" model.activation_class="torch.nn.ReLU"
```

Available models and their configs can be found at [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers).
They are loaded into a dataclass [`ModelConfig`](benchmarl/models/common.py), defining their domain.
You can also directly load them from a script by calling `YourModelConfig.get_from_yaml()`.

Here is an example of overriding model hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.py).
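
A minimal script sketch (the field names are taken from the MLP overrides shown above):

```python
from torch import nn

from benchmarl.models.mlp import MlpConfig

# Loads from "benchmarl/conf/model/layers/mlp.yaml"
model_config = MlpConfig.get_from_yaml()

# Override model hyperparameters as dataclass attributes
model_config.num_cells = [32, 32]
model_config.layer_class = nn.Linear
model_config.activation_class = nn.ReLU
```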

#### Sequence model

You can compose layers into a sequence model.
Available layer names are in the [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers) folder.

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo model=sequence "model.intermediate_sizes=[256]" "model/layers@model.layers.l1=mlp" "model/[email protected]=mlp" "+model/[email protected]=mlp" "model.layers.l3.num_cells=[3]"
```
Add a layer with `"+model/[email protected]=mlp"`.

Remove a layer with `"~model.layers.l2"`.

Configure a layer with `"model.layers.l1.num_cells=[3]"`.

Here is an example of creating a sequence model from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.py).
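
And a minimal script sketch, assuming `SequenceModelConfig` is exposed in [benchmarl/models/common.py](benchmarl/models/common.py) and takes `model_configs` and `intermediate_sizes` arguments mirroring the hydra keys above:

```python
from benchmarl.models.common import SequenceModelConfig
from benchmarl.models.mlp import MlpConfig

# Two MLP layers loaded from their yaml defaults
layer_1 = MlpConfig.get_from_yaml()
layer_2 = MlpConfig.get_from_yaml()
layer_2.num_cells = [3]  # configure an individual layer

# Chain them, with a 256-wide intermediate representation in between
model_config = SequenceModelConfig(
    model_configs=[layer_1, layer_2],
    intermediate_sizes=[256],
)
```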

## Features

BenchMARL has several features:
- A test CI with test routines run for all simulators and algorithms
- Integration in the official TorchRL ecosystem for dedicated support


### Logging

BenchMARL is compatible with the [TorchRL loggers](https://github.com/pytorch/rl/tree/main/torchrl/record/loggers).
A list of logger names can be provided in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml).
Examples of available options are: `wandb`, `csv`, `mlflow`, `tensorboard` or any other option available in TorchRL. You can specify the loggers
in the yaml config files or in the script arguments like so:
```bash
python benchmarl/run.py algorithm=mappo task=vmas/balance "experiment.loggers=[wandb]"
```

Additionally, you can specify a `create_json` argument which instructs the trainer to output a `.json` file in the
format specified by [marl-eval](https://github.com/instadeepai/marl-eval).
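
Both options are also plain fields on the experiment config (a minimal sketch):

```python
from benchmarl.experiment import ExperimentConfig

experiment_config = ExperimentConfig.get_from_yaml()
experiment_config.loggers = ["wandb"]  # any TorchRL logger name
experiment_config.create_json = True  # also dump a marl-eval compatible json
```
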
### Checkpointing

Experiments can be checkpointed every `experiment.checkpoint_interval` iterations.
Experiments will use an output folder for logging and checkpointing which can be specified in `experiment.save_folder`.
If this is left unspecified, the default will be the hydra output folder (if using hydra) or, otherwise, the
current directory where the script is launched.
The output folder will contain a folder for each experiment with the corresponding experiment name.
Their checkpoints will be stored in a `"checkpoints"` folder within the experiment folder.
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=3 experiment.checkpoint_interval=1 experiment.save_folder="/my/folder"
```

To load from a checkpoint, pass the absolute checkpoint file name to `experiment.restore_file`.
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=6 experiment.restore_file="/my/folder/checkpoint/checkpoint_03.pt"
```

[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/checkpointing/reload_experiment.py)

### Callbacks
TBC
4 changes: 2 additions & 2 deletions benchmarl/eval_results.py
@@ -22,7 +22,7 @@


def get_raw_dict_from_multirun_folder(multirun_folder: str) -> Dict:
-    return _load_and_merge_json_dicts(_get_json_files_from_multirun(multirun_folder))
+    return load_and_merge_json_dicts(_get_json_files_from_multirun(multirun_folder))


def _get_json_files_from_multirun(multirun_folder: str) -> List[str]:
@@ -34,7 +34,7 @@ def _get_json_files_from_multirun(multirun_folder: str) -> List[str]:
return files


-def _load_and_merge_json_dicts(
+def load_and_merge_json_dicts(
json_input_files: List[str], json_output_file: Optional[str] = None
) -> Dict:
def update(d, u):
51 changes: 51 additions & 0 deletions examples/checkpointing/reload_experiment.py
@@ -0,0 +1,51 @@
import os
from pathlib import Path

from benchmarl.algorithms import MappoConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

experiment_config = ExperimentConfig.get_from_yaml()
# Save the experiment in the current folder
experiment_config.save_folder = Path(os.path.dirname(os.path.realpath(__file__)))
# Checkpoint at every iteration
experiment_config.checkpoint_interval = 1
# Run 3 iterations
experiment_config.n_iters = 3

task = VmasTask.BALANCE.get_from_yaml()
algorithm_config = MappoConfig.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()
experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()

# Now we tell it where to restore from
experiment_config.restore_file = (
experiment.folder_name
/ "checkpoints"
/ f"checkpoint_{experiment_config.n_iters}.pt"
)
    # The experiment will be saved in the same folder as the one it is restoring from
experiment_config.save_folder = None
# Let's do 3 more iters
experiment_config.n_iters += 3

experiment = Experiment(
algorithm_config=algorithm_config,
model_config=model_config,
seed=0,
config=experiment_config,
task=task,
)
experiment.run()
2 changes: 2 additions & 0 deletions examples/checkpointing/reload_experiment.sh
@@ -0,0 +1,2 @@
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=3 experiment.checkpoint_interval=1
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=6 experiment.restore_file="/hydra/experiment/folder/checkpoint/checkpoint_03.pt"
30 changes: 30 additions & 0 deletions examples/configuring/configuring_algorithm.py
@@ -0,0 +1,30 @@
from benchmarl.algorithms import MasacConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

# Loads from "benchmarl/conf/algorithm/masac.yaml"
algorithm_config = MasacConfig.get_from_yaml()

# You can override from the script
algorithm_config.num_qvalue_nets = 3 # Use an ensemble of 3 Q value nets
algorithm_config.target_entropy = "auto" # Set target entropy to auto
algorithm_config.share_param_critic = True # Use parameter sharing in the critic

# Some basic other configs
experiment_config = ExperimentConfig.get_from_yaml()
task = VmasTask.BALANCE.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()

experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()
1 change: 1 addition & 0 deletions examples/configuring/configuring_algorithm.sh
@@ -0,0 +1 @@
python benchmarl/run.py task=vmas/balance algorithm=masac algorithm.num_qvalue_nets=3 algorithm.target_entropy=auto algorithm.share_param_critic=true
30 changes: 30 additions & 0 deletions examples/configuring/configuring_experiment.py
@@ -0,0 +1,30 @@
from benchmarl.algorithms import MappoConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

# Loads from "benchmarl/conf/experiment/base_experiment.yaml"
experiment_config = ExperimentConfig.get_from_yaml()

# You can override from the script
experiment_config.lr = 0.03 # Change the learning rate
experiment_config.evaluation = True # Set evaluation to true
experiment_config.train_device = "cpu" # Change the training device

# Some basic other configs
task = VmasTask.BALANCE.get_from_yaml()
algorithm_config = MappoConfig.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()

experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()
1 change: 1 addition & 0 deletions examples/configuring/configuring_experiment.sh
@@ -0,0 +1 @@
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.lr=0.03 experiment.evaluation=true experiment.train_device="cpu"