
RouteRL

Multi-Agent Reinforcement Learning framework for modeling and simulating the collective route choices of humans and autonomous vehicles.

RouteRL is a novel framework that integrates Multi-Agent Reinforcement Learning (MARL) with SUMO, a microscopic traffic simulator, to facilitate the testing and development of efficient route choice strategies. The framework simulates the daily route choices of driver agents in a city, which come in two types:

  • human drivers, emulated using discrete choice models,
  • autonomous vehicles (AVs), modeled as MARL agents optimizing their policies for a predefined objective.

RouteRL aims to advance research in MARL, traffic assignment problems, social reinforcement learning (RL), and human-AI interaction for transportation applications.
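
To give a concrete sense of the first agent type, the sketch below shows a generic multinomial-logit route choice rule, the kind of discrete choice model used to emulate human drivers. It is purely illustrative and not RouteRL's internal implementation; the route names, costs, and the beta sensitivity parameter are invented for this example.

import numpy as np

def logit_route_choice(route_costs, beta=0.5, rng=None):
    """Pick a route with multinomial-logit probabilities (illustrative only)."""
    rng = rng or np.random.default_rng()
    routes = list(route_costs)                      # route identifiers
    utilities = -beta * np.array([route_costs[r] for r in routes])
    probs = np.exp(utilities - utilities.max())     # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(routes, p=probs)

# Example: three alternative routes with different expected travel times (minutes)
print(logit_route_choice({"route_0": 12.0, "route_1": 15.0, "route_2": 18.0}))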

For an overview, see the paper; for more details, check the online documentation.

RouteRL usage and functionalities at a glance

The following is a simplified example of a standard MARL training pipeline implemented via TorchRL.

env = TrafficEnvironment(seed=42, **env_params) # initialize the traffic environment

env.start() # start the connection with SUMO

for episode in range(human_learning_episodes): # human learning 
    env.step()

env.mutation() # some human agents transition to AV agents

collector = SyncDataCollector(env, policy, ...)  # collects experience by running the policy in the environment (TorchRL)

# training of the autonomous vehicles; human agents follow the fixed decisions learned in their learning phase
for tensordict_data in collector:
    replay_buffer.extend(tensordict_data)  # store the collected experience

    # update the policies of the learning agents
    for _ in range(num_epochs):
        subdata = replay_buffer.sample()
        loss_vals = loss_module(subdata)

        loss_value = sum(loss_vals[key] for key in loss_vals.keys() if key.startswith("loss_"))
        loss_value.backward()
        optimizer.step()
        optimizer.zero_grad()
    collector.update_policy_weights_()

policy.eval() # set the policy into evaluation mode

# testing phase using the already trained policy
num_episodes = 100
for episode in range(num_episodes):
    env.rollout(len(env.machine_agents), policy=policy)
 
env.plot_results() # plot the results
env.stop_simulation() # stop the connection with SUMO
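
For completeness, the objects used above (collector, replay_buffer, loss_module, optimizer) could be constructed with standard TorchRL components roughly as follows. This is a minimal sketch assuming a PPO-style setup; the policy and critic modules, as well as the batch-size hyperparameters, are placeholders rather than values prescribed by RouteRL.

import torch
from torchrl.collectors import SyncDataCollector
from torchrl.data import LazyTensorStorage, ReplayBuffer
from torchrl.data.replay_buffers.samplers import SamplerWithoutReplacement
from torchrl.objectives import ClipPPOLoss

frames_per_batch = 100      # tune to the number of AV agents and episode length
total_frames = 10_000
minibatch_size = 32

# `policy` and `critic` are TensorDictModules built for the environment's specs (not shown here)
collector = SyncDataCollector(env, policy,
                              frames_per_batch=frames_per_batch,
                              total_frames=total_frames)

replay_buffer = ReplayBuffer(storage=LazyTensorStorage(frames_per_batch),
                             sampler=SamplerWithoutReplacement(),
                             batch_size=minibatch_size)

loss_module = ClipPPOLoss(actor_network=policy, critic_network=critic)
optimizer = torch.optim.Adam(loss_module.parameters(), lr=3e-4)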

Documentation

Installation

  • Prerequisite: Make sure you have SUMO installed on your system. This should be done separately, following the instructions provided here.
  • Option 1: Install the latest stable version from PyPI:
      pip install routerl
    
  • Option 2: Clone this repository for the latest version, and manually install its dependencies:
      git clone https://github.com/COeXISTENCE-PROJECT/RouteRL.git
      cd RouteRL
      pip install -r requirements.txt
    
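
After installation, a quick sanity check along these lines can confirm that both SUMO and RouteRL are reachable from Python. The SUMO_HOME check follows SUMO's own setup convention; the snippet is only an illustrative check, not part of the RouteRL API.

import os

# SUMO's Python tools expect SUMO_HOME to point to the SUMO installation directory
assert "SUMO_HOME" in os.environ, "Please declare the SUMO_HOME environment variable"

import routerl  # should import without errors if the package is installed
print("SUMO_HOME is set and RouteRL imported successfully")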

Reproducibility capsule

We provide an experiment script encapsulated in a CodeOcean capsule. This capsule demonstrates RouteRL's capabilities without requiring a SUMO installation or dependency management.

  1. Visit the capsule link.
  2. Create a free CodeOcean account (if you don’t have one).
  3. Click Reproducible Run to execute the code in a controlled and reproducible environment.

Credits

RouteRL is part of COeXISTENCE (ERC Starting Grant, grant agreement No 101075838) and is the result of teamwork at Jagiellonian University in Kraków, Poland, by Ahmet Onur Akman and Anastasia Psarou (main contributors), supported by Grzegorz Jamroz, Zoltán Varga, Łukasz Gorczyca, Michał Hoffman, and others, within the research group of Rafał Kucharski.