A Flexible Cooperative MARL Method for Efficient Passage of an Emergency CAV in Mixed Traffic

Published: 01 Jan 2024 · Last Modified: 13 May 2025 · IEEE Trans. Intell. Transp. Syst. 2024 · CC BY-SA 4.0
Abstract: Connected and autonomous vehicles (CAVs) make it possible to carry out coordinated control strategies, and thus hold great potential for improving traffic efficiency and road safety. The efficient passage of an emergency vehicle requires collaborative driving decision-making among multiple vehicles in a dynamically changing local area. However, existing work fails to adapt efficiently to dynamic and complex traffic conditions and therefore cannot solve this task well. To obtain a better solution, we propose a flexible cooperative multi-agent reinforcement learning (MARL) approach based on value function factorization, called Q-LSTM. Since the traffic environment is partially observable, the centralized-training, decentralized-execution paradigm is adopted to learn effective cooperative strategies for individual agents. To adapt flexibly to the changing neighborhood around the emergency vehicle, we introduce a long short-term memory (LSTM) network that decomposes the learned global value function into the local value function of each agent within the neighborhood, whose size and membership vary over time. To address the credit assignment problem and realize the different roles of the emergency and regular vehicles, the reward mechanism and the update scheme of the agent-wise Q-networks are carefully designed. Extensive experiments are conducted on the Simulation of Urban MObility (SUMO) platform. Results show that Q-LSTM outperforms state-of-the-art value-based MARL methods. Moreover, the robustness and adaptability of Q-LSTM are verified under increased traffic density.
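The abstract's key architectural idea is that an LSTM can mix a *variable* number of per-agent local Q-values into one global value, because a recurrent cell consumes its inputs sequentially rather than through a fixed-width layer. The sketch below illustrates only this idea; it is not the paper's implementation, and all names (`LSTMMixer`, `mix`, the hidden size) are assumptions for illustration. A minimal NumPy LSTM cell steps once per agent in the emergency vehicle's neighborhood and reads out a scalar global value, so neighborhoods of different sizes are handled by the same network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMMixer:
    """Toy LSTM cell that mixes a sequence of per-agent local Q-values
    into a scalar global value; the sequence length (number of agents
    in the neighborhood) can vary between calls."""

    def __init__(self, hidden_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.h_dim = hidden_dim
        # One weight matrix per gate; each gate sees [q_i, h_prev].
        self.W = {g: rng.normal(0.0, 0.1, (hidden_dim, 1 + hidden_dim))
                  for g in ("i", "f", "o", "c")}
        self.b = {g: np.zeros(hidden_dim) for g in ("i", "f", "o", "c")}
        self.w_out = rng.normal(0.0, 0.1, hidden_dim)  # readout to scalar

    def mix(self, local_qs):
        h = np.zeros(self.h_dim)
        c = np.zeros(self.h_dim)
        for q in local_qs:  # one recurrent step per neighborhood agent
            x = np.concatenate(([q], h))
            i = sigmoid(self.W["i"] @ x + self.b["i"])  # input gate
            f = sigmoid(self.W["f"] @ x + self.b["f"])  # forget gate
            o = sigmoid(self.W["o"] @ x + self.b["o"])  # output gate
            g = np.tanh(self.W["c"] @ x + self.b["c"])  # candidate cell
            c = f * c + i * g
            h = o * np.tanh(c)
        return float(self.w_out @ h)  # scalar global value

mixer = LSTMMixer()
# The same mixer accepts neighborhoods of different sizes:
q_tot_3 = mixer.mix([1.2, -0.5, 0.3])
q_tot_5 = mixer.mix([1.2, -0.5, 0.3, 0.9, -0.1])
```

In a training setup this readout would be fit end-to-end so that each agent's contribution is credited through the recurrent state, which is the hedge-free part of the abstract's claim: a recurrence, unlike a fixed-input mixing network, does not need to know the agent count in advance.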