Interpretable Multi-Agent Reinforcement Learning for Traffic Signal Control: Influence Mechanism and Piecewise Linear Approximation
Abstract: Traffic signal control plays a crucial role in intelligent transportation systems, and cooperative control, though challenging to implement, is essential to its effectiveness. Many methods model multi-intersection traffic networks as grids and address the problem using multi-agent reinforcement learning (RL). Despite these studies, there remains an opportunity to better understand the connectivity and globality of traffic networks by capturing spatiotemporal traffic information with efficient neural networks in deep RL. In this paper, we propose a novel multi-agent actor-critic framework based on an interpretable influence mechanism with centralized learning and decentralized execution. Specifically, we first construct an actor-critic framework in which a piecewise linear neural network (PWLNN), named biased ReLU (BReLU), serves as the function approximator, yielding a more accurate and theoretically grounded approximation while remaining interpretable. Then, to model the relationships among agents in multi-intersection scenarios, we introduce an interpretable influence mechanism based on the efficient hinging hyperplanes neural network (EHHNN), which derives weights among agents by analysis of variance (ANOVA) decomposition and extracts spatiotemporal dependencies of the traffic features. Finally, the proposed framework is validated on two synthetic traffic networks and a real road network to coordinate signal control between intersections, achieving lower traffic delays across the entire traffic network than the benchmarks.
Note to Practitioners: The motivation of this paper is to develop an interpretable multi-intersection control framework that coordinates traffic signals across urban networks to mitigate congestion. When congestion occurs on the global traffic network, the strategy coordinates the signal controllers at different intersections to minimize vehicle waiting time on the road network and improve traffic flow. The strategy introduces an interpretable influence mechanism and an efficient training paradigm to better understand how decisions at one intersection affect traffic flow at upstream and downstream intersections, and thereby improves the effectiveness of the control outcomes. The developed strategy has been validated through co-simulation using Python and the Simulation of Urban Mobility (SUMO) traffic simulator on two synthetic traffic networks and a real road network.
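To make the idea of a piecewise linear function approximator concrete, the following is a minimal NumPy sketch, not the paper's BReLU network. It assumes a one-dimensional input, a fixed set of hinge locations (`biases`), and an ordinary least-squares fit in place of actor-critic training; the function names `hinge_features`, `fit_piecewise_linear`, and `predict` are hypothetical and chosen only for illustration. Each hinge feature `max(0, x - b)` contributes one breakpoint, so the fitted model is linear between consecutive biases, and each weight can be read as the change in slope at its breakpoint.

```python
import numpy as np


def hinge_features(x, biases):
    """Map inputs x (shape [n]) to the features [1, x, max(0, x - b_1), ...]."""
    x = np.asarray(x, dtype=float)
    # One ReLU-style hinge per bias; each adds a breakpoint at x = b.
    hinges = np.maximum(0.0, x[:, None] - biases[None, :])
    return np.column_stack([np.ones_like(x), x, hinges])


def fit_piecewise_linear(x, y, biases):
    """Least-squares fit of the hinge weights (an assumption; not RL training)."""
    phi = hinge_features(x, biases)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return weights


def predict(x, weights, biases):
    """Evaluate the fitted piecewise linear model at x."""
    return hinge_features(x, biases) @ weights


# Illustrative target: |x| = -x + 2 * max(0, x), which the hinge at 0 represents exactly.
x_train = np.linspace(-1.0, 1.0, 41)
y_train = np.abs(x_train)
biases = np.array([-0.5, 0.0, 0.5])  # hypothetical breakpoint locations

weights = fit_piecewise_linear(x_train, y_train, biases)
max_error = float(np.max(np.abs(predict(x_train, weights, biases) - y_train)))
print("weights:", np.round(weights, 6))
print("max abs error:", max_error)
```

Because the target lies in the span of the features, the fit recovers it up to floating-point error, and the only hinge with a non-negligible weight is the one at the true breakpoint. This is the sense in which such models are easy to inspect; how the paper's BReLU and EHHNN components are actually parameterized and trained is described in the paper itself.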
External IDs: doi:10.1109/tase.2025.3618243