CDRP3: Cascade Deep Reinforcement Learning for Urban Driving Safety With Joint Perception, Prediction, and Planning

Yuxiang Yang, Fenglong Ge, Jinlong Fan, Jufeng Zhao, Zhekang Dong

Published: 01 Jan 2025, Last Modified: 30 Jul 2025, Venue: IEEE Trans. Intell. Transp. Syst. 2025, License: CC BY-SA 4.0

Abstract: Safe urban driving is challenging due to the high density of traffic flow and various potential hazards, such as the sudden appearance of unknown objects. Traditional rule-based approaches and imitation learning methods struggle to address the diverse driving scenarios encountered in urban environments. Reinforcement learning (RL), which adapts to a wide range of driving scenarios through continuous interaction with the environment, has demonstrated success in autonomous driving. Making safety decisions when driving in urban environments necessitates a comprehensive perception of the current scene and the ability to predict the evolution of the dynamic scene. In this paper, we present a novel cascade deep reinforcement learning framework, CDRP3, designed to enhance the safety decision-making capabilities of self-driving vehicles in complex scenarios and emergencies. We leverage a multi-modal spatio-temporal perception (MmSTP) module to fuse multi-modal sensor data and introduce temporal perception to capture spatio-temporal information about dynamic driving environments, and a future state prediction (FSP) module to model complex interactions between different traffic participants and explicitly predict their future states. Subsequently, in the PPO-based planning module, we use the comprehensive environmental information obtained from perception and prediction to decode an optimized driving strategy using a lateral and longitudinal separated multi-branch network structure guided by a customized reward function. This approach enables knowledge transfer from the perception and prediction components to planning, and planning-oriented enhancement of safety decision-making capabilities to improve driving safety. Our experiments demonstrate that CDRP3 outperforms state-of-the-art methods, providing superior driving safety in complex urban environments.
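The abstract names a PPO-based planner with separate lateral and longitudinal branches but gives no implementation details. The following is a minimal sketch of the standard PPO clipped surrogate objective applied to a two-branch action, not the authors' code. It assumes each branch outputs an independent Gaussian over one scalar control (steering for lateral, acceleration for longitudinal), so the joint log-probability is the sum of the two. The function names, the Gaussian parameterization, and the clip range `eps=0.2` are all illustrative assumptions.

```python
import math


def gaussian_logp(x, mean, std):
    """Log-density of a scalar Gaussian; stands in for one branch's policy head."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def joint_logp(steer, accel, lat_head, lon_head):
    """Log-probability of a (steer, accel) action.

    Assumption: the lateral and longitudinal branches are independent,
    so their log-probabilities add. Each head is a (mean, std) pair.
    """
    return gaussian_logp(steer, *lat_head) + gaussian_logp(accel, *lon_head)


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    For each sample, the probability ratio new/old is clipped to
    [1 - eps, 1 + eps], and the smaller of the clipped and unclipped
    terms is kept.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical old and updated policy heads for one observed action.
    old = joint_logp(0.1, 0.5, lat_head=(0.0, 0.2), lon_head=(0.4, 0.3))
    new = joint_logp(0.1, 0.5, lat_head=(0.05, 0.2), lon_head=(0.45, 0.3))
    print(ppo_clip_objective([new], [old], [1.0]))
```

Clipping limits how much a single update can raise the probability of an action with positive advantage, which is the usual reason PPO is preferred when large policy shifts would be risky. Whether CDRP3 shares one ratio across both branches or clips each branch separately is not stated in the abstract.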

External IDs: dblp:journals/tits/YangGFZD25