CaliFree3DLane: Calibration-Free Spatio-Temporal BEV Representation for Monocular 3D Lane Detection

Published: 01 Jan 2025 · Last Modified: 13 Feb 2025 · IEEE Trans. Intell. Transp. Syst. 2025 · CC BY-SA 4.0
Abstract: Monocular 3D lane detection plays a crucial role in autonomous driving, helping vehicles navigate safely. Existing methods primarily rely on the calibrated camera parameters provided with a dataset to detect 3D lanes from a single image. However, errors in, or the sudden absence of, camera parameters pose significant challenges to safe driving: on one hand, they corrupt feature acquisition and thus degrade lane detection precision; on the other hand, they render methods that rely on transformation matrices for temporal fusion ineffective. To address these issues and achieve accurate 3D lane detection, we propose CaliFree3DLane, a calibration-free method for spatio-temporal 3D lane detection built on a Transformer architecture. Instead of using geometric projection to obtain static reference points on images, we propose a reference point refinement strategy that dynamically updates the reference points and ultimately generates appropriate sampling points for image feature extraction. To integrate multi-frame features, we generate sub-queries from the current scene query so that each attends to the image features of one frame independently; we then aggregate these sub-queries into a more comprehensive scene query for 3D lane detection. Through these operations, CaliFree3DLane accurately transforms multi-frame image features into the current bird's-eye-view (BEV) space, enabling precise 3D lane detection. Experimental results show that CaliFree3DLane achieves state-of-the-art 3D lane detection performance on multiple datasets. Compared with Transformer-based methods of the same type, it also improves the F1 score by ${6.0\%}\sim{10.5\%}$. Code is available at https://github.com/Ciisrlab/CaliFree3DLane.
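The sub-query mechanism described in the abstract can be illustrated with a minimal NumPy sketch. All shapes, the per-frame linear projections, and the mean aggregation below are illustrative assumptions for exposition, not the paper's actual implementation: a shared scene query is projected into one sub-query per frame, each sub-query attends to its own frame's image features, and the per-frame results are fused back into a single, richer scene query.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): 4 frames, 12 lane queries,
# 64-dim embeddings, 100 image-feature tokens per frame.
T, N_Q, C, N_KV = 4, 12, 64, 100

def attend(q, kv):
    """Plain scaled dot-product attention; keys double as values."""
    scores = q @ kv.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

scene_query = rng.standard_normal((N_Q, C))
frame_feats = rng.standard_normal((T, N_KV, C))

# Per-frame projections turn the scene query into T sub-queries ...
W_sub = rng.standard_normal((T, C, C)) / np.sqrt(C)
sub_queries = np.stack([scene_query @ W_sub[t] for t in range(T)])

# ... each sub-query attends to its own frame's features independently ...
per_frame = np.stack(
    [attend(sub_queries[t], frame_feats[t]) for t in range(T)]
)

# ... and the per-frame results are aggregated (here: a simple mean)
# into one updated scene query used for 3D lane decoding.
updated_scene_query = per_frame.mean(axis=0)
assert updated_scene_query.shape == (N_Q, C)
```

Note that nothing in this sketch uses camera intrinsics or extrinsics: the association between queries and image features is learned through attention, which is what makes the calibration-free temporal fusion possible.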