I2D-Loc++: Camera Pose Tracking in LiDAR Maps With Multi-View Motion Flows

Huai Yu; Kuangyi Chen; Wen Yang; Sebastian A. Scherer; Gui-Song Xia

Published: 01 Jan 2024, Last Modified: 14 Nov 2024. Venue: IEEE Robotics Autom. Lett. 2024. License: CC BY-SA 4.0

Abstract: Camera localization in LiDAR maps has become increasingly popular because it can handle complex scenarios that defeat visual-only localization methods. However, existing approaches mostly focus on bridging the cross-modal 2D–3D gap while overlooking the relationship between adjacent image frames, which leads to fluctuating and unreliable camera poses. To address this, we introduce I2D-Loc++, a novel framework for camera pose tracking in LiDAR maps that couples 2D–3D correspondences with 2D–2D feature matching, establishing multi-view geometric constraints that improve localization stability and trajectory smoothness. Specifically, the framework consists of a front-end hybrid flow estimation network and a nonlinear least-squares pose optimization module. We further design a cross-modal consistency loss that integrates the multi-view motion flows into both network training and back-end pose optimization. The pose tracking model is trained on the KITTI odometry dataset and tested on the KITTI odometry, Argoverse, Waymo and Lyft5 datasets. The results demonstrate that I2D-Loc++ achieves superior performance and generalizes well, improving both the accuracy and robustness of camera pose tracking.
