Perception Helps Planning: Facilitating Multi-Stage Lane-Level Integration via Double-Edge Structures

Published: 01 Jan 2025, Last Modified: 11 Apr 2025 · IEEE Robotics and Automation Letters, 2025 · License: CC BY-SA 4.0
Abstract: When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, these elements are often overlooked by traditional end-to-end planning methods, which can lead to inefficient driving and non-compliance with traffic regulations. In this work, we integrate the perception of these elements into the planning task. To this end, we propose Perception Helps Planning (PHP), a novel framework that reconciles lane-level planning with perception. This integration ensures that planning is inherently aligned with traffic constraints, facilitating safe and efficient driving. Specifically, PHP treats both edges of a lane as the shared unit for perception and planning, accounting for the positions of both lane edges in Bird's Eye View (BEV) as well as attributes related to lane intersections, lane directions, and lane occupancy. In the algorithmic design, a transformer first encodes multi-camera images to extract these features and predict lane-level perception results. Next, a hierarchical feature early fusion module refines the features to predict planning attributes. Finally, an interpreter applies late fusion to integrate lane-level perception and planning information and generate vehicle control signals. Experiments on three Carla benchmarks show improvements in driving score of 27.20%, 33.47%, and 15.54% over existing algorithms, respectively, achieving state-of-the-art performance while the system runs at up to 22.57 FPS.
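To make the three-stage pipeline described in the abstract concrete, below is a minimal PyTorch sketch of its structure: transformer-based lane-edge perception, a hierarchical early-fusion module for planning attributes, and a late-fusion interpreter that produces control signals. All module names, dimensions, query counts, and output heads here are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class LaneEdgePerception(nn.Module):
    """Sketch: transformer decoder over multi-camera BEV tokens that predicts
    lane-edge positions and lane attributes. Heads/dims are hypothetical."""
    def __init__(self, feat_dim=256, num_queries=64):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, feat_dim))
        layer = nn.TransformerDecoderLayer(d_model=feat_dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.edge_head = nn.Linear(feat_dim, 4)   # (x, y) of left/right lane edge in BEV
        self.attr_head = nn.Linear(feat_dim, 3)   # intersection / direction / occupancy logits

    def forward(self, cam_feats):                 # cam_feats: (B, num_tokens, feat_dim)
        q = self.queries.unsqueeze(0).expand(cam_feats.size(0), -1, -1)
        lane_feats = self.decoder(q, cam_feats)
        return lane_feats, self.edge_head(lane_feats), self.attr_head(lane_feats)


class EarlyFusionPlanner(nn.Module):
    """Sketch: refines lane features into per-lane planning attributes (early fusion)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                  nn.Linear(feat_dim, feat_dim))
        self.plan_head = nn.Linear(feat_dim, 1)   # per-lane drivability / selection score

    def forward(self, lane_feats):
        fused = self.fuse(lane_feats)
        return fused, self.plan_head(fused)


class LateFusionInterpreter(nn.Module):
    """Sketch: late fusion of perception and planning outputs into control signals."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.control_head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                          nn.Linear(64, 3))   # steer, throttle, brake

    def forward(self, fused_feats, plan_scores):
        # Weight lane features by planning scores, pool, then decode control signals.
        weights = plan_scores.softmax(dim=1)
        pooled = (fused_feats * weights).sum(dim=1)
        return self.control_head(pooled)


if __name__ == "__main__":
    cam_feats = torch.randn(2, 6 * 100, 256)      # e.g. 6 cameras x 100 BEV tokens each
    lane_feats, edges, attrs = LaneEdgePerception()(cam_feats)
    fused, scores = EarlyFusionPlanner()(lane_feats)
    controls = LateFusionInterpreter()(fused, scores)
    print(edges.shape, attrs.shape, controls.shape)
```

In this sketch, the late-fusion step simply pools lane features weighted by their planning scores; the actual interpreter in PHP may combine the lane-level perception and planning outputs differently.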