[Docs] Examples and README updates
Signed-off-by: Matteo Bettini <[email protected]>
matteobettini committed Oct 5, 2023
1 parent 132e1b8 commit 3b03189
Showing 21 changed files with 440 additions and 53 deletions.
168 changes: 120 additions & 48 deletions README.md
@@ -1,3 +1,5 @@
![BenchMARL](https://drive.google.com/uc?export=view&id=15rSPUadQCXfsJq7G2UPor9f-VwzvQ8k7)

# BenchMARL
[![tests](https://github.com/facebookresearch/BenchMARL/actions/workflows/unit_tests.yml/badge.svg)](test)
[![Python](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10-blue.svg)](https://www.python.org/downloads/)
@@ -200,7 +202,8 @@ determine the training strategy. Here is a table with the currently implemented


**Tasks**. Tasks are scenarios from a specific environment which constitute the MARL
challenge to solve.
They differ based on many aspects; here is a table with the current environments in BenchMARL:

| Environment | Tasks | Cooperation | Global state | Reward function |
|-------------|---------------------------------------|---------------------------|--------------|-------------------------------|
@@ -210,6 +213,12 @@
| [MPE](https://github.com/openai/multiagent-particle-envs) | [TBC](benchmarl/conf/task/pettingzoo) | Cooperative + Competitive | Yes | Shared + Independent |
| [SISL](https://github.com/sisl/MADRL) | [TBC](benchmarl/conf/task/pettingzoo) | Cooperative | No | Shared |

> [!NOTE]
> BenchMARL uses the [TorchRL MARL API](https://github.com/pytorch/rl/issues/1463) for grouping agents.
> In competitive environments like MPE, for example, teams will be in different groups. Each group has its own loss,
> models, buffers, and so on. Parameter sharing options refer to sharing within the group. See the example on [creating
> a custom algorithm](examples/extending/custom_algorithm.py) for more info.

**Models**. Models are neural networks used to process data. They can be used as actors (policies) or,
when possible, as critics. We provide a set of base models (layers) and a SequenceModel to concatenate
different layers. All the models can be used with or without parameter sharing within an agent group.
@@ -228,103 +237,166 @@ And the ones that are _work in progress_


## Reporting and plotting

Reporting and plotting are compatible with [marl-eval](https://github.com/instadeepai/marl-eval).
If `experiment.create_json=True` (this is the default in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml))
a file named `{experiment_name}.json` will be created in the experiment output folder with the format of [marl-eval](https://github.com/instadeepai/marl-eval).
You can load and merge these files using the utils in [eval_results](benchmarl/eval_results.py) to create beautiful plots of
your benchmarks.

[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/plotting)

![aggregate_scores](https://drive.google.com/uc?export=view&id=1-f3NolMSjsWppCSXv_DJcs_GUD_fv7vO)
![sample_efficiency](https://drive.google.com/uc?export=view&id=1FK37EfiqD3AQXWlQj7HQCkQDRNe2TuLy)
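
For instance, here is a minimal sketch of merging results from a script, using only the utilities in [eval_results](benchmarl/eval_results.py) (the folder and file paths are hypothetical):

```python
from benchmarl.eval_results import (
    get_raw_dict_from_multirun_folder,
    load_and_merge_json_dicts,
)

# Merge all the marl-eval JSON files found under a hydra multirun folder
raw_dict = get_raw_dict_from_multirun_folder("multirun/2023-10-05/12-00-00")

# Or merge an explicit list of experiment JSON files,
# optionally writing the merged result to disk
merged = load_and_merge_json_dicts(
    ["outputs/exp_a.json", "outputs/exp_b.json"],
    json_output_file="merged.json",
)
```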

## Extending
One of the core tenets of BenchMARL is allowing users to leverage the existing algorithm
and task implementations to benchmark their newly proposed solutions.

For this reason we expose standard interfaces for [algorithms](benchmarl/algorithms/common.py), [tasks](benchmarl/environments/common.py) and [models](benchmarl/models/common.py).
To introduce your solution in the library, you just need to implement the abstract methods
exposed by these base classes, which use objects from the [TorchRL](https://github.com/pytorch/rl) library.

Here is an example of how you can create a custom algorithm [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_algorithm.py).

Here is an example of how you can create a custom task [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_task.py).

Here is an example of how you can create a custom model [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_model.py).
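
To give a flavor of the workflow, here is a hedged sketch of a custom model. The hook names used (`_forward`, `associated_class`) are assumptions inferred from the base classes in [benchmarl/models/common.py](benchmarl/models/common.py); check that file and the examples above for the actual abstract API.

```python
from dataclasses import dataclass

import torch
from tensordict import TensorDictBase

from benchmarl.models.common import Model, ModelConfig


class MyModel(Model):
    # Assumed abstract hook: read this group's input from the tensordict
    # and write the model output back under the expected output key
    def _forward(self, tensordict: TensorDictBase) -> TensorDictBase:
        obs = tensordict.get(self.in_keys[0])
        tensordict.set(self.out_keys[0], torch.zeros_like(obs))  # placeholder computation
        return tensordict


@dataclass
class MyModelConfig(ModelConfig):
    # Assumed hook linking the config dataclass to its model class
    @staticmethod
    def associated_class():
        return MyModel
```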


## Configuring
As highlighted in the [run](#run) section, the project can be configured either
in the script itself or via [hydra](https://hydra.cc/docs/intro/).
We suggest reading the hydra documentation
to get familiar with all its functionalities.

Experiment configurations are in [`benchmarl/conf/config.yaml`](benchmarl/conf/config.yaml),
with the experiment hyperparameters in [`benchmarl/conf/experiment`](benchmarl/conf/experiment).


Running custom experiments is extremely simplified by the [Hydra](https://hydra.cc/) configurations.
The default configuration for the library is contained in the [`benchmarl/conf`](benchmarl/conf) folder.

To run an experiment, you need to select a task and an algorithm
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo
```
You can run a set of experiments, for example, like this
```bash
python benchmarl/run.py --multirun task=vmas/balance algorithm=mappo,maddpg,masac,qmix
```
When running an experiment, you can override its hyperparameters like so
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.lr=0.03 experiment.evaluation=true experiment.train_device="cpu"
```

Experiment hyperparameters are loaded from [`benchmarl/conf/experiment/base_experiment.yaml`](benchmarl/conf/experiment/base_experiment.yaml)
into a dataclass [`ExperimentConfig`](benchmarl/experiment/experiment.py) defining their domain.
This ensures that all and only the expected parameters are loaded, with the right types.
You can also directly load them from a script by calling `ExperimentConfig.get_from_yaml()`.

Here is an example of overriding experiment hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.py).
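
In a script, the same overrides are plain attribute assignments (a minimal sketch, mirroring the hydra overrides above):

```python
from benchmarl.experiment import ExperimentConfig

# Loads from "benchmarl/conf/experiment/base_experiment.yaml"
experiment_config = ExperimentConfig.get_from_yaml()

# Override experiment hyperparameters as dataclass attributes
experiment_config.lr = 0.03
experiment_config.evaluation = True
experiment_config.train_device = "cpu"
```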

### Algorithm

You can override an algorithm configuration when launching BenchMARL.

```bash
python benchmarl/run.py task=vmas/balance algorithm=masac algorithm.num_qvalue_nets=3 algorithm.target_entropy=auto algorithm.share_param_critic=true
```

We suggest not modifying the algorithm configs when running your benchmarks, in order to guarantee
reproducibility.

Available algorithms and their default configs can be found at [`benchmarl/conf/algorithm`](benchmarl/conf/algorithm).
They are loaded into a dataclass [`AlgorithmConfig`](benchmarl/algorithms/common.py), present for each algorithm, defining their domain.
This ensures that all and only the expected parameters are loaded, with the right types.
You can also directly load them from a script by calling `YourAlgorithmConfig.get_from_yaml()`.

Here is an example of overriding algorithm hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.py).
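
For instance, a minimal sketch of the script route (mirroring the hydra overrides above):

```python
from benchmarl.algorithms import MasacConfig

# Loads from "benchmarl/conf/algorithm/masac.yaml"
algorithm_config = MasacConfig.get_from_yaml()

# Override algorithm hyperparameters as dataclass attributes
algorithm_config.num_qvalue_nets = 3  # Use an ensemble of 3 Q-value nets
algorithm_config.target_entropy = "auto"
algorithm_config.share_param_critic = True
```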

### Task

You can override a task configuration when launching BenchMARL.
However, this is not recommended for benchmarking, as tasks should have a fixed version and parameters for reproducibility.

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo task.n_agents=4
```

Available tasks and their default configs can be found at [`benchmarl/conf/task`](benchmarl/conf/task).
They are loaded into a dataclass [`TaskConfig`](benchmarl/environments/common.py), defining their domain.
Tasks are enumerations under the environment name. For example, `VmasTask.NAVIGATION` represents the navigation task in the
VMAS simulator. This allows autocompletion and seeing all available tasks at once.
You can also directly load them from a script by calling `YourEnvTask.TASK_NAME.get_from_yaml()`.

Here is an example of overriding task hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.py).
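
From a script, a minimal sketch looks like this (assuming task overrides live in the task's `config` dict, as in the linked example):

```python
from benchmarl.environments import VmasTask

# Loads from "benchmarl/conf/task/vmas/balance.yaml"
task = VmasTask.BALANCE.get_from_yaml()

# Override task parameters through the loaded config dict
task.config["n_agents"] = 4
```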

### Model

You can override the model configuration when launching BenchMARL.
By default an MLP model will be loaded with the default config.

You can change it like so:

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo model=layers/mlp model.layer_class="torch.nn.Linear" "model.num_cells=[32,32]" model.activation_class="torch.nn.ReLU"
```

Available models and their configs can be found at [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers).
They are loaded into a dataclass [`ModelConfig`](benchmarl/models/common.py), defining their domain.
You can also directly load them from a script by calling `YourModelConfig.get_from_yaml()`.

Here is an example of overriding model hyperparameters from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.py).
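
A minimal script sketch (the field names are taken from the MLP overrides shown above):

```python
from torch import nn

from benchmarl.models.mlp import MlpConfig

# Loads from "benchmarl/conf/model/layers/mlp.yaml"
model_config = MlpConfig.get_from_yaml()

# Override model hyperparameters as dataclass attributes
model_config.num_cells = [32, 32]
model_config.layer_class = nn.Linear
model_config.activation_class = nn.ReLU
```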

#### Sequence model

You can compose layers into a sequence model.
Available layer names are in the [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers) folder.

```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo model=sequence "model.intermediate_sizes=[256]" "model/layers@model.layers.l1=mlp" "model/[email protected]=mlp" "+model/[email protected]=mlp" "model.layers.l3.num_cells=[3]"
```
Add a layer with `"+model/[email protected]=mlp"`.

Remove a layer with `"~model.layers.l2"`.

Configure a layer with `"model.layers.l1.num_cells=[3]"`.

Here is an example of creating a sequence model from hydra
[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.sh) or from
a script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.py).
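
And a minimal script sketch, assuming `SequenceModelConfig` is exposed in [benchmarl/models/common.py](benchmarl/models/common.py) and takes `model_configs` and `intermediate_sizes` arguments mirroring the hydra keys above:

```python
from benchmarl.models.common import SequenceModelConfig
from benchmarl.models.mlp import MlpConfig

# Two MLP layers loaded from their yaml defaults
layer_1 = MlpConfig.get_from_yaml()
layer_2 = MlpConfig.get_from_yaml()
layer_2.num_cells = [3]  # configure an individual layer

# Chain them, with a 256-wide intermediate representation in between
model_config = SequenceModelConfig(
    model_configs=[layer_1, layer_2],
    intermediate_sizes=[256],
)
```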

## Features

BenchMARL has several features:
- A test CI with test routines run for all simulators and algorithms
- Integration in the official TorchRL ecosystem for dedicated support


### Logging

BenchMARL is compatible with the [TorchRL loggers](https://github.com/pytorch/rl/tree/main/torchrl/record/loggers).
A list of logger names can be provided in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml).
Examples of available options are: `wandb`, `csv`, `mlflow`, `tensorboard` or any other option available in TorchRL. You can specify the loggers
in the yaml config files or in the script arguments like so:
```bash
python benchmarl/run.py algorithm=mappo task=vmas/balance "experiment.loggers=[wandb]"
```

Additionally, you can specify a `create_json` argument which instructs the trainer to output a `.json` file in the
format specified by [marl-eval](https://github.com/instadeepai/marl-eval).
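
Both options are also plain fields on the experiment config (a minimal sketch):

```python
from benchmarl.experiment import ExperimentConfig

experiment_config = ExperimentConfig.get_from_yaml()
experiment_config.loggers = ["wandb"]  # any TorchRL logger name
experiment_config.create_json = True  # also dump a marl-eval compatible json
```
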
### Checkpointing

Experiments can be checkpointed every `experiment.checkpoint_interval` iterations.
Experiments will use an output folder for logging and checkpointing which can be specified in `experiment.save_folder`.
If this is left unspecified, the default will be the hydra output folder (if using hydra) or, otherwise, the
current directory where the script is launched.
The output folder will contain a folder for each experiment with the corresponding experiment name.
Their checkpoints will be stored in a `"checkpoints"` folder within the experiment folder.
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=3 experiment.checkpoint_interval=1 experiment.save_folder="/my/folder"
```

To load from a checkpoint, pass the absolute checkpoint file name to `experiment.restore_file`.
```bash
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=6 experiment.restore_file="/my/folder/checkpoint/checkpoint_03.pt"
```

[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/checkpointing/reload_experiment.py)

### Callbacks
TBC
4 changes: 2 additions & 2 deletions benchmarl/eval_results.py
@@ -22,7 +22,7 @@


def get_raw_dict_from_multirun_folder(multirun_folder: str) -> Dict:
-    return _load_and_merge_json_dicts(_get_json_files_from_multirun(multirun_folder))
+    return load_and_merge_json_dicts(_get_json_files_from_multirun(multirun_folder))


def _get_json_files_from_multirun(multirun_folder: str) -> List[str]:
@@ -34,7 +34,7 @@ def _get_json_files_from_multirun(multirun_folder: str) -> List[str]:
return files


-def _load_and_merge_json_dicts(
+def load_and_merge_json_dicts(
json_input_files: List[str], json_output_file: Optional[str] = None
) -> Dict:
def update(d, u):
51 changes: 51 additions & 0 deletions examples/checkpointing/reload_experiment.py
@@ -0,0 +1,51 @@
import os
from pathlib import Path

from benchmarl.algorithms import MappoConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

experiment_config = ExperimentConfig.get_from_yaml()
# Save the experiment in the current folder
experiment_config.save_folder = Path(os.path.dirname(os.path.realpath(__file__)))
# Checkpoint at every iteration
experiment_config.checkpoint_interval = 1
# Run 3 iterations
experiment_config.n_iters = 3

task = VmasTask.BALANCE.get_from_yaml()
algorithm_config = MappoConfig.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()
experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()

# Now we tell it where to restore from
experiment_config.restore_file = (
experiment.folder_name
/ "checkpoints"
/ f"checkpoint_{experiment_config.n_iters}.pt"
)
    # The experiment will be saved in the same folder as the one it is restoring from
experiment_config.save_folder = None
# Let's do 3 more iters
experiment_config.n_iters += 3

experiment = Experiment(
algorithm_config=algorithm_config,
model_config=model_config,
seed=0,
config=experiment_config,
task=task,
)
experiment.run()
2 changes: 2 additions & 0 deletions examples/checkpointing/reload_experiment.sh
@@ -0,0 +1,2 @@
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=3 experiment.checkpoint_interval=1
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.n_iters=6 experiment.restore_file="/hydra/experiment/folder/checkpoint/checkpoint_03.pt"
30 changes: 30 additions & 0 deletions examples/configuring/configuring_algorithm.py
@@ -0,0 +1,30 @@
from benchmarl.algorithms import MasacConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

# Loads from "benchmarl/conf/algorithm/masac.yaml"
algorithm_config = MasacConfig.get_from_yaml()

# You can override from the script
algorithm_config.num_qvalue_nets = 3 # Use an ensemble of 3 Q value nets
algorithm_config.target_entropy = "auto" # Set target entropy to auto
algorithm_config.share_param_critic = True # Use parameter sharing in the critic

# Some basic other configs
experiment_config = ExperimentConfig.get_from_yaml()
task = VmasTask.BALANCE.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()

experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()
1 change: 1 addition & 0 deletions examples/configuring/configuring_algorithm.sh
@@ -0,0 +1 @@
python benchmarl/run.py task=vmas/balance algorithm=masac algorithm.num_qvalue_nets=3 algorithm.target_entropy=auto algorithm.share_param_critic=true
30 changes: 30 additions & 0 deletions examples/configuring/configuring_experiment.py
@@ -0,0 +1,30 @@
from benchmarl.algorithms import MappoConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

if __name__ == "__main__":

# Loads from "benchmarl/conf/experiment/base_experiment.yaml"
experiment_config = ExperimentConfig.get_from_yaml()

# You can override from the script
experiment_config.lr = 0.03 # Change the learning rate
experiment_config.evaluation = True # Set evaluation to true
experiment_config.train_device = "cpu" # Change the training device

# Some basic other configs
task = VmasTask.BALANCE.get_from_yaml()
algorithm_config = MappoConfig.get_from_yaml()
model_config = MlpConfig.get_from_yaml()
critic_model_config = MlpConfig.get_from_yaml()

experiment = Experiment(
task=task,
algorithm_config=algorithm_config,
model_config=model_config,
critic_model_config=critic_model_config,
seed=0,
config=experiment_config,
)
experiment.run()
1 change: 1 addition & 0 deletions examples/configuring/configuring_experiment.sh
@@ -0,0 +1 @@
python benchmarl/run.py task=vmas/balance algorithm=mappo experiment.lr=0.03 experiment.evaluation=true experiment.train_device="cpu"