[Fix] Remove PettingZoo pursuit task
Signed-off-by: Matteo Bettini <[email protected]>
matteobettini committed Oct 8, 2023
1 parent 1eaf349 commit 8bec460
Showing 7 changed files with 26 additions and 79 deletions.
42 changes: 21 additions & 21 deletions README.md
@@ -189,35 +189,35 @@ See the [run](#run) section for more information.
**Algorithms**. Algorithms are an ensemble of components (e.g., loss, replay buffer) which
determine the training strategy. Here is a table with the currently implemented algorithms in BenchMARL.

-| Name | On/Off policy | Actor-critic | Full-observability in critic | Action compatibility | Probabilistic actor |
-|----------------------------------------|---------------|--------------|------------------------------|-------------------------------|---------------------|
-| [MAPPO](https://arxiv.org/abs/2103.01955) | On | Yes | Yes | Continuous + Discrete | Yes |
-| [IPPO](https://arxiv.org/abs/2011.09533) | On | Yes | No | Continuous + Discrete | Yes |
-| [MADDPG](https://arxiv.org/abs/1706.02275) | Off | Yes | Yes | Continuous | No |
-| [IDDPG](benchmarl/algorithms/iddpg.py) | Off | Yes | No | Continuous | No |
-| [MASAC](benchmarl/algorithms/masac.py) | Off | Yes | Yes | Continuous + Discrete | Yes |
-| [ISAC](benchmarl/algorithms/isac.py) | Off | Yes | No | Continuous + Discrete | Yes |
-| [QMIX](https://arxiv.org/abs/1803.11485) | Off | No | NA | Discrete | No |
-| [VDN](https://arxiv.org/abs/1706.05296) | Off | No | NA | Discrete | No |
-| [IQL](https://www.semanticscholar.org/paper/Multi-Agent-Reinforcement-Learning%3A-Independent-Tan/59de874c1e547399b695337bcff23070664fa66e) | Off | No | NA | Discrete | No |
+| Name | On/Off policy | Actor-critic | Full-observability in critic | Action compatibility | Probabilistic actor |
+|---------------------------------------------------------------------------------------------------------------------------------------------|---------------|--------------|------------------------------|-----------------------|---------------------|
+| [MAPPO](https://arxiv.org/abs/2103.01955) | On | Yes | Yes | Continuous + Discrete | Yes |
+| [IPPO](https://arxiv.org/abs/2011.09533) | On | Yes | No | Continuous + Discrete | Yes |
+| [MADDPG](https://arxiv.org/abs/1706.02275) | Off | Yes | Yes | Continuous | No |
+| [IDDPG](benchmarl/algorithms/iddpg.py) | Off | Yes | No | Continuous | No |
+| [MASAC](benchmarl/algorithms/masac.py) | Off | Yes | Yes | Continuous + Discrete | Yes |
+| [ISAC](benchmarl/algorithms/isac.py) | Off | Yes | No | Continuous + Discrete | Yes |
+| [QMIX](https://arxiv.org/abs/1803.11485) | Off | No | NA | Discrete | No |
+| [VDN](https://arxiv.org/abs/1706.05296) | Off | No | NA | Discrete | No |
+| [IQL](https://www.semanticscholar.org/paper/Multi-Agent-Reinforcement-Learning%3A-Independent-Tan/59de874c1e547399b695337bcff23070664fa66e) | Off | No | NA | Discrete | No |
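
As context for the table above (editorial note, not part of this diff): a minimal sketch of how one of these algorithms is typically selected and run in BenchMARL, assuming the `*Config.get_from_yaml()` loaders and the `Experiment` API described elsewhere in the README.

```python
from benchmarl.algorithms import MappoConfig
from benchmarl.environments import VmasTask
from benchmarl.experiment import Experiment, ExperimentConfig
from benchmarl.models.mlp import MlpConfig

# Each component is loaded from its YAML defaults and can be overridden in code.
experiment = Experiment(
    task=VmasTask.BALANCE.get_from_yaml(),
    algorithm_config=MappoConfig.get_from_yaml(),  # any algorithm from the table above
    model_config=MlpConfig.get_from_yaml(),
    critic_model_config=MlpConfig.get_from_yaml(),
    seed=0,
    config=ExperimentConfig.get_from_yaml(),
)
experiment.run()
```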


**Tasks**. Tasks are scenarios from a specific environment which constitute the MARL
challenge to solve.
They differ in many respects; here is a table of the environments currently in BenchMARL.

-| Enviromnent | Tasks | Cooperation | Global state | Reward function |
-|-------------|-------------------------------------|---------------------------|--------------|-------------------------------|
-| [VMAS](https://github.com/proroklab/VectorizedMultiAgentSimulator) | [5](benchmarl/conf/task/vmas) | Cooperative + Competitive | No | Shared + Independent + Global |
-| [SMACv2](https://github.com/oxwhirl/smacv2) | [15](benchmarl/conf/task/smacv2) | Cooperative | Yes | Global |
-| [MPE](https://github.com/openai/multiagent-particle-envs) | [8](benchmarl/conf/task/pettingzoo) | Cooperative + Competitive | Yes | Shared + Independent |
-| [SISL](https://github.com/sisl/MADRL) | [3](benchmarl/conf/task/pettingzoo) | Cooperative | No | Shared |
+| Environment | Tasks | Cooperation | Global state | Reward function | Action space |
+|--------------------------------------------------------------------|-------------------------------------|---------------------------|--------------|-------------------------------|-----------------------|
+| [VMAS](https://github.com/proroklab/VectorizedMultiAgentSimulator) | [5](benchmarl/conf/task/vmas) | Cooperative + Competitive | No | Shared + Independent + Global | Continuous + Discrete |
+| [SMACv2](https://github.com/oxwhirl/smacv2) | [15](benchmarl/conf/task/smacv2) | Cooperative | Yes | Global | Discrete |
+| [MPE](https://github.com/openai/multiagent-particle-envs) | [8](benchmarl/conf/task/pettingzoo) | Cooperative + Competitive | Yes | Shared + Independent | Continuous + Discrete |
+| [SISL](https://github.com/sisl/MADRL) | [2](benchmarl/conf/task/pettingzoo) | Cooperative | No | Shared | Continuous |
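
Again as an editorial illustration (not part of the diff), a sketch of loading one of the remaining PettingZoo tasks after this commit, assuming the task enums are exposed in `benchmarl.environments` and follow the same `get_from_yaml()` pattern:

```python
from benchmarl.environments import PettingZooTask

# PURSUIT is no longer a member of PettingZooTask after this commit;
# the remaining SISL tasks are MULTIWALKER and WATERWORLD.
task = PettingZooTask.MULTIWALKER.get_from_yaml()

# Hyperparameters come from benchmarl/conf/task/pettingzoo/multiwalker.yaml;
# for PettingZoo tasks the episode length is read from the "max_cycles" entry.
print(task.config["max_cycles"])
```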

> [!NOTE]
> BenchMARL uses the [TorchRL MARL API](https://github.com/pytorch/rl/issues/1463) for grouping agents.
> In competitive environments like MPE, for example, teams will be in different groups. Each group has its own loss,
> models, buffers, and so on. Parameter sharing options refer to sharing within the group. See the example on [creating
-> a custom algorithm](examples/extending/custom_algorithm.py) for more info.
+> a custom algorithm](examples/extending/algorithm/custom_algorithm.py) for more info.
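
To make the grouping idea concrete, here is a purely illustrative group map of the kind returned by the `group_map()` methods further down in this diff; the group and agent names are hypothetical:

```python
from typing import Dict, List

# In a competitive MPE scenario, opposing teams land in separate groups;
# each group then gets its own loss, models, and replay buffer.
group_map: Dict[str, List[str]] = {
    "agents": ["agent_0", "agent_1", "agent_2"],
    "adversaries": ["adversary_0"],
}
```
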
**Models**. Models are neural networks used to process data. They can be used as actors (policies) or,
when possible, as critics. We provide a set of base models (layers) and a SequenceModel to concatenate
@@ -257,11 +257,11 @@ For this reason we expose standard interfaces for [algorithms](benchmarl/algorit
To introduce your solution in the library, you just need to implement the abstract methods
exposed by these base classes which use objects from the [TorchRL](https://github.com/pytorch/rl) library.

-Here is an example on how you can create a custom algorithm [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_algorithm.py).
+Here is an example on how you can create a custom algorithm [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/algorithm).

-Here is an example on how you can create a custom task [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_task.py).
+Here is an example on how you can create a custom task [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/task).

-Here is an example on how you can create a custom model [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/custom_model.py).
+Here is an example on how you can create a custom model [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/model).
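
For readers who cannot open the linked examples, below is a rough, deliberately incomplete sketch of what a custom task wrapper can look like, modeled on the `PettingZooTask` methods touched further down in this diff. `MyEnvTask`, `MY_SCENARIO`, and the import path of the `Task` base class are assumptions, not the actual example code.

```python
from typing import Dict, List

from torchrl.envs import EnvBase

from benchmarl.environments.common import Task  # assumed location of the Task base class


class MyEnvTask(Task):
    # Each member maps to a YAML file, e.g. benchmarl/conf/task/myenv/my_scenario.yaml
    MY_SCENARIO = None

    def supports_continuous_actions(self) -> bool:
        return True

    def supports_discrete_actions(self) -> bool:
        return False

    def has_render(self, env: EnvBase) -> bool:
        return True

    def max_steps(self, env: EnvBase) -> int:
        # Same bool -> int return-annotation fix that this commit applies to the built-in wrappers.
        return self.config["max_steps"]

    def group_map(self, env: EnvBase) -> Dict[str, List[str]]:
        # Hypothetical single-group mapping; real wrappers build this from the env.
        return {"agents": ["agent_0", "agent_1"]}
```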


## Configuring
32 changes: 0 additions & 32 deletions benchmarl/conf/task/pettingzoo/pursuit.yaml

This file was deleted.

2 changes: 0 additions & 2 deletions benchmarl/environments/__init__.py
@@ -12,7 +12,6 @@


from .pettingzoo.multiwalker import TaskConfig as MultiwalkerConfig
-from .pettingzoo.pursuit import TaskConfig as PursuitConfig
from .pettingzoo.simple_adverasary import TaskConfig as SimpleAdversaryConfig
from .pettingzoo.simple_crypto import TaskConfig as SimpleCryptoConfig
from .pettingzoo.simple_push import TaskConfig as SimplePushConfig
@@ -37,7 +36,6 @@
    "vmas_transport_config": TransportConfig,
    "vmas_wheel_config": WheelConfig,
    "pettingzoo_multiwalker_config": MultiwalkerConfig,
-    "pettingzoo_pursuit_config": PursuitConfig,
    "pettingzoo_waterworld_config": WaterworldConfig,
    "pettingzoo_simple_adversary_config": SimpleAdversaryConfig,
    "pettingzoo_simple_crypto_config": SimpleCryptoConfig,
4 changes: 3 additions & 1 deletion benchmarl/environments/pettingzoo/common.py
@@ -10,6 +10,7 @@

class PettingZooTask(Task):
    MULTIWALKER = None
+    WATERWORLD = None
    SIMPLE_ADVERSARY = None
    SIMPLE_CRYPTO = None
    SIMPLE_PUSH = None
@@ -42,6 +43,7 @@ def get_env_fun(
    def supports_continuous_actions(self) -> bool:
        if self in {
            PettingZooTask.MULTIWALKER,
+            PettingZooTask.WATERWORLD,
            PettingZooTask.SIMPLE_TAG,
            PettingZooTask.SIMPLE_ADVERSARY,
            PettingZooTask.SIMPLE_CRYPTO,
@@ -88,7 +90,7 @@ def has_state(self) -> bool:
    def has_render(self, env: EnvBase) -> bool:
        return True

-    def max_steps(self, env: EnvBase) -> bool:
+    def max_steps(self, env: EnvBase) -> int:
        return self.config["max_cycles"]

    def group_map(self, env: EnvBase) -> Dict[str, List[str]]:
21 changes: 0 additions & 21 deletions benchmarl/environments/pettingzoo/pursuit.py

This file was deleted.

2 changes: 1 addition & 1 deletion benchmarl/environments/smacv2/common.py
@@ -47,7 +47,7 @@ def supports_discrete_actions(self) -> bool:
    def has_render(self, env: EnvBase) -> bool:
        return True

-    def max_steps(self, env: EnvBase) -> bool:
+    def max_steps(self, env: EnvBase) -> int:
        return env.episode_limit

    def group_map(self, env: EnvBase) -> Dict[str, List[str]]:
2 changes: 1 addition & 1 deletion benchmarl/environments/vmas/common.py
@@ -41,7 +41,7 @@ def supports_discrete_actions(self) -> bool:
    def has_render(self, env: EnvBase) -> bool:
        return True

-    def max_steps(self, env: EnvBase) -> bool:
+    def max_steps(self, env: EnvBase) -> int:
        return self.config["max_steps"]

    def group_map(self, env: EnvBase) -> Dict[str, List[str]]:
