Abstract: Global visual localization is an important task in robotics, with applications such as SLAM and autonomous navigation. Current place recognition approaches restrict the query data to the same modality as the database data. However, real-world robots are equipped with different sensors in different application scenarios, and data from a single fixed modality cannot accommodate all challenging environments. To overcome this limitation, we propose a generalized model that allows both spherical images and point clouds to be retrieved from any single-modal query. Our 2D-3D dataset is built on the KITTI360 dataset, providing spherical images and corresponding point clouds for training and evaluation. Extensive experimental results demonstrate the effectiveness of the proposed approach.
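The retrieval setting described above can be illustrated with a minimal sketch: modality-specific encoders map spherical images and point clouds into a shared embedding space, and a query from either modality is matched against the same database by nearest-neighbor search. The random-projection "encoders" and dimensions below are purely hypothetical stand-ins, not the paper's networks; they only show the cross-modal retrieval mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 32  # hypothetical shared embedding dimension

# Stand-in "encoders": one linear projection per modality into the shared space.
W_image = rng.normal(size=(128, EMBED_DIM))  # spherical-image features -> shared space
W_cloud = rng.normal(size=(256, EMBED_DIM))  # point-cloud features -> shared space

def embed(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Project features into the shared space and L2-normalize them."""
    v = features @ W
    return v / np.linalg.norm(v)

# Database: embeddings of 100 mapped places (here computed from image features).
db_feats = rng.normal(size=(100, 128))
database = np.stack([embed(f, W_image) for f in db_feats])

def retrieve(query_emb: np.ndarray, database: np.ndarray) -> int:
    """Return the index of the most similar database place (cosine similarity,
    since all embeddings are unit-normalized)."""
    return int(np.argmax(database @ query_emb))

# A query from either modality is retrieved against the same database.
image_query_idx = retrieve(embed(rng.normal(size=128), W_image), database)
cloud_query_idx = retrieve(embed(rng.normal(size=256), W_cloud), database)
```

In a trained system the two encoders would be learned jointly so that embeddings of the same place coincide across modalities; here the projections are random, so only the retrieval interface is meaningful.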