Optimizing Traffic Signal Control Using LLM-Driven Reward Weight Adjustment in Reinforcement Learning
![]() |
![]() |
---|---|
Fig.1.D3QN-LLM Framework | Fig.2.Flow of the Proposed Methodology |
- This study utilizes a Large Language Model (LLM) to overcome the challenge of setting weights in multi-objective reward functions in reinforcement learning. By dynamically adjusting the reward function weights in real-time, the LLM enables the reinforcement learning agent to learn an optimal policy that maximizes rewards.
- The experimental environment consists of a single intersection, with the goal of minimizing Average Travel Time through traffic signal control.
- Dueling Double DQN (D3QN) is used as the reinforcement learning algorithm, while GPT-4o-mini is employed as the LLM for adjusting reward function weights.
- The proposed model is referred to as D3QN-LLM, and its framework is illustrated in Fig. 1. The detailed process flow of the proposed method is shown in Fig. 2.
- By utilizing an LLM to dynamically address the weight-setting challenge in reward function design, a key aspect of reinforcement learning, this study proposes a new research direction.
- Python, PyTorch (GPU Acceleration) - Deep learning & reinforcement learning implementation
- SUMO (Simulation of Urban Mobility) - Traffic simulation and vehicle data generation
- GPT-4o-mini (LLM API) - Lightweight language model for traffic situation analysis
- LangChain Framework - LLM prompt engineering and response control
- Docker - Containerized environment for deployment and reproducibility
- OS: Ubuntu 22.04.5 / Windows (WSL supported)
- Docker: Latest version recommended (tested with Docker 27.5.1)
- CUDA Support: PyTorch acceleration in an NVIDIA GPU environment (if available)
- SUMO Version: 1.20.0 or later recommended
- Python Environment: Python 3.10.12 or later (venv or Docker recommended)
📦D3QN-LLM
┣ 📂asset
┣ 📂d3qn_imgs
┣ 📂d3qn_models
┣ 📂eval_d3qn_imgs
┣ 📂eval_d3qn_txts
┣ 📂llm
┣ 📂logs
┣ 📂sumo
┃ ┣ 📂add
┃ ┣ 📂detectors
┃ ┣ 📂net
┃ ┣ 📂rou
┃ ┣ 📂trip
┣ 📜d3qn_agent.py
┣ 📜d3qn_tsc_main.py
┗ 📜tsc_env.py
The following research papers and presentations are related to this project.
If this research has been helpful, please cite the papers below.
- [Accepted] Sujeong Choi and Yujin Lim, “Optimizing Traffic Signal Control Using LLM-Driven Reward Weight Adjustment in Reinforcement Learning,” Journal of Information Processing Systems (Editing/Proofreading in progress, Expected publication: February 2025).
- [Oral Presentation]
최수정 and 임유진, "LLM을 이용한 강화학습기반 교차로 신호 제어," 한국정보처리학회 학술대회논문집, vol. 31, no. 2, pp. 672-675, 2024.
- B. P. Gokulan and D. Srinivasan, "Distributed geometric fuzzy multiagent urban traffic signal control," IEEE Transactions on Intelligent Transportation Systems, vol. 11, no. 3, pp. 714-727, Sep. 2010. DOI
- H. Ceylan and M. G. H. Bell, "Traffic signal timing optimisation based on genetic algorithm approach including drivers’ routing," Transportation Research Part B: Methodological, vol. 38, no. 4, pp. 329-342, 2004. DOI
- Sujeong Choi and Yujin Lim, "Reinforcement Learning-Based Traffic Signal Control Using Large Language Models," Annual Conference of KIPS, vol. 31, no. 2, pp. 672-675, 2024.
- G. Zheng, X. Zang, N. Xu, H. Wei, Z. Yu, V. Gayah, et al., "Diagnosing reinforcement learning for traffic signal control," arXiv preprint, 2019. DOI
- H. Lee, Y. Han, Y. Kim, and Y. H. Kim, "Effects analysis of reward functions on reinforcement learning for traffic signal control," PLoS ONE, vol. 17, no. 11, 2022. DOI
- S. Lai, Z. Xu, W. Zhang, H. Liu, and H. Xiong, "Large language models as traffic signal control agents: Capacity and opportunity," arXiv preprint, 2023. DOI
- A. Pang, M. Wang, M. O. Pun, C. S. Chen, and X. Xiong, "iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement," arXiv preprint, 2024. DOI
- P. A. Lopez, M. Behrisch, L. B. Walz, J. Erdmann, Y. P. Flotterod, R. Hilbrich, L. Lucken, J. Rummel, P. Wagner, and E. Wiessner, "Microscopic traffic simulation using SUMO," Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2575-2582, 2018. DOI
- L. Da, M. Gao, H. Mei, and H. Wei, "Prompt to transfer: Sim-to-real transfer for traffic signal control with prompt learning," AAAI Conference on Artificial Intelligence, vol. 38, pp. 82–90, 2024. DOI
This project code was developed for personal research and experimental purposes and is stored in a public repository.
This code is not open-source and may not be used for commercial purposes, copied, or distributed without permission.
Copyright (c) 2025 suzzang