This repository contains the source code for the WFCRL multi-agent RL benchmark, which runs on the WFCRL wind farm control environment suite. All experiments are adapted from the CleanRL repository.

Algorithms:
Algorithm | File | Description
---|---|---
IPPO | `algos/baseline_ippo.py` | See Yu et al.
MAPPO | `algos/baseline_mappo.py` | See Yu et al.
QMIX | `algos/baseline_qmix.py` | See Rashid et al.
IFAC | `algos/ifac.py` | Simple online actor-critic with Fourier basis
IDQN | `algos/idqn.py` | Simple independent DQN
Install the dependencies:

```bash
pip install -r requirements.txt
```
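If you prefer an isolated setup, a standard virtual environment works (a minimal sketch; the `.venv` path is just a convention, adjust to taste):

```bash
# Create and activate a virtual environment, then install the pinned dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```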
Launch an IPPO training experiment on the Dec_Ablaincourt_Floris environment:

```bash
python algos/baseline_ippo.py --seed 1 --env_id Dec_Ablaincourt_Floris --total_timesteps 1000000
```
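To reproduce results over several seeds, the same command can simply be looped; the flags below are exactly those of the command above, only the seed changes (the choice of seeds 1-3 is illustrative):

```bash
# Launch one IPPO training run per seed, sequentially
for seed in 1 2 3; do
    python algos/baseline_ippo.py --seed "$seed" \
        --env_id Dec_Ablaincourt_Floris --total_timesteps 1000000
done
```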
Evaluate it on the Dec_Ablaincourt_Fastfarm environment:

```bash
mpiexec -n 1 python algos/eval.py --seed 0 --env_id Dec_Ablaincourt_Fastfarm --total_timesteps 10000 --pretrained_models path/to/run
```
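Since the same evaluation script accepts both simulator backends, a single pretrained run can be evaluated on both back to back (a sketch; `path/to/run` stands in for your actual run directory):

```bash
# Evaluate one pretrained run on both the Floris and FastFarm variants
for env in Dec_Ablaincourt_Floris Dec_Ablaincourt_Fastfarm; do
    mpiexec -n 1 python algos/eval.py --seed 0 --env_id "$env" \
        --total_timesteps 10000 --pretrained_models path/to/run
done
```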
Scripts for the training and evaluation runs are in the `scripts` folder. Add `--scenario windrose` to train or evaluate on Wind Scenario II:

```bash
python algos/baseline_ippo.py --seed 1 --env_id Dec_Ablaincourt_Floris --total_timesteps 1000000 --scenario windrose
```
To track the experiment in Wandb, add your API key in a `.env` file at the root of the folder:

```
WANDB_API_KEY=your_api_key
```
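Alternatively, the key can be exported for a single shell session, since wandb also reads `WANDB_API_KEY` from the environment by default:

```bash
# One-off alternative to the .env file: export the key for this shell session
export WANDB_API_KEY=your_api_key
```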