A Multi-Agent Reinforcement Learning Based Control Method for CAVs in a Mixed Platoon

Yaqi Xu, Yan Shi, Xiaolu Tong, Shanzhi Chen, Yuming Ge

Published: 2024, Last Modified: 17 May 2025. IEEE Trans. Veh. Technol. 2024. CC BY-SA 4.0

Abstract: With the development of autonomous driving technology and the Internet of Vehicles, platooning based on the control of Connected and Autonomous Vehicles (CAVs) has become one of the most promising approaches to improving traffic efficiency and safety. However, CAVs and Human-Driven Vehicles (HDVs) will coexist for a long period, and the inherent randomness of HDVs may cause traffic flow oscillations within a mixed platoon. This paper proposes a control strategy, Relative Position Encoding Multi Actor Attention Critic (RPE-MAAC), for CAVs operating in mixed platoons with HDVs. Specifically, RPE-MAAC integrates relative position encoding into the MAAC framework, which enhances the model's capability to handle spatial relationships among platoon members and improves platooning control effectiveness. By incorporating HDV reaction delays into the reward function, RPE-MAAC ensures the stability of the mixed platoon, reducing disruptions and enhancing driving comfort. In addition, a more accurate HDV trajectory generation model, IDM-Informer, is incorporated into the problem formulation to provide feedback that more closely resembles real-world mixed platoon scenarios. Finally, the effectiveness and stability of the proposed RPE-MAAC-based controller are verified through numerical simulations under various conditions. Comparisons with existing traditional algorithms and state-of-the-art optimization algorithms show the superior performance of the proposed algorithm.
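For readers unfamiliar with the car-following component, the sketch below shows the standard Intelligent Driver Model (IDM) acceleration law. It assumes that the "IDM" in IDM-Informer refers to this model, which the abstract does not state explicitly. The function name and all parameter values are illustrative defaults, not values from the paper, and the Informer-based part of the authors' model is not represented here.

```python
import math


def idm_acceleration(v, gap, dv,
                     v0=30.0,    # desired speed [m/s] (illustrative assumption)
                     T=1.5,      # desired time headway [s] (illustrative assumption)
                     s0=2.0,     # minimum standstill gap [m] (illustrative assumption)
                     a_max=1.0,  # maximum acceleration [m/s^2] (illustrative assumption)
                     b=1.5,      # comfortable deceleration [m/s^2] (illustrative assumption)
                     delta=4.0): # free-flow acceleration exponent
    """Standard IDM acceleration of a following vehicle.

    v   : follower speed [m/s]
    gap : bumper-to-bumper distance to the leader [m], must be > 0
    dv  : approach rate, follower speed minus leader speed [m/s]
    """
    # Desired dynamic gap: standstill gap plus headway and braking terms.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


if __name__ == "__main__":
    # A follower at 20 m/s closing on a slower leader 25 m ahead.
    print(idm_acceleration(v=20.0, gap=25.0, dv=3.0))
```

In a mixed-platoon simulation, a model of this kind would typically generate the HDV response to the CAV ahead at each time step; how the authors combine it with learned trajectory prediction is described in the full paper, not in this sketch.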