PreBEV: Leveraging Predictive Flow for Enhanced Bird's-Eye View 3D Dynamic Object Detection

Published: 01 Jan 2024, Last Modified: 10 Apr 2025 | QRS Companion 2024 | CC BY-SA 4.0
Abstract: Recent camera-based 3D object detection methods for autonomous driving use surround-view cameras and exploit spatio-temporal consistency to improve detection and instance prediction performance. These methods rely on dense BEV spatial features and rich temporal information to compensate for weak attention to objects, but they overlook the motion of dynamic targets and their positions in future image frames. We propose a novel framework called PreBEV. PreBEV uses predictive flow to construct query anchor boxes and learns the implicit motion characteristics of image sequences through close association with pixels, which exploits the adaptability of the BEV and image views to ease the computational burden of dense queries. In addition, the framework introduces BEV feature fusion guided by predictive flow, which avoids the limitations of traditional hand-designed temporal fusion strategies (weighted sum, concatenation), makes full use of object motion characteristics, and improves attention to dynamic objects. PreBEV simplifies the multi-task objective and improves prediction stability and detection accuracy. Experimental results show that PreBEV achieves superior performance on the nuScenes dataset, offering a new research paradigm for BEV detection.