Guidelines for Parameter Selection in Traffic Light Control Methods Using Reinforcement Learning: Insights from Empirical Studies

Lang Qian, Peng Sun, Kun Yang, Jingyu Zhang, Azzedine Boukerche, Liang Song

Published: 2024, Last Modified: 07 Nov 2025, IWQoS 2024, CC BY-SA 4.0

Abstract: Ever-changing traffic dynamics leave traditional traffic signal control methods unable to adapt to their environment. Deep reinforcement learning (DRL), by contrast, learns by interacting with the environment and can adapt as it changes. In recent years, researchers have therefore commonly addressed traffic signal control (TSC) problems with DRL methods. They have improved not only the design of the neural networks but also, by designing different states and rewards, the ability of models to understand traffic conditions and learn the corresponding task requirements. However, although existing DRL-based TSC algorithms have proposed many well-designed state and reward strategies, which combinations of states and rewards should be adopted in practice to realize a model's full performance potential remains an open question. We therefore introduce a general simulation platform to test and compare experimental performance under different combinations of states and rewards. Specifically, we test and analyze combinations of multiple traffic states and rewards across various TSC methods under a unified set of model settings. We further design and test some new state representations and reward strategies based on more detailed traffic information. The results show that refining the traffic state (for example, with vehicle running conditions) and matching the state to the reward yields better experimental performance than other combinations in most cases. We hope these results can inform the choice of state and reward when researchers conduct experiments on the TSC problem or other traffic decision management problems.

External IDs: dblp:conf/iwqos/QianSYZBS24