Abstract: Global visual localization is an important task in robotics, with applications such as SLAM and autonomous navigation. Current place recognition approaches restrict the query data to the same modality as the database data. However, real-world robots are equipped with different sensors in different application scenarios, and data from a single fixed modality cannot accommodate all challenging environments. To overcome this limitation, we propose a generalized model that allows both spherical images and point clouds to be retrieved from any single-modal query. Our 2D-3D dataset is built on the KITTI360 dataset, providing spherical images and corresponding point clouds for training and evaluation. Extensive experimental results demonstrate the effectiveness of the proposed approach.
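The retrieval setting described above can be illustrated with a minimal sketch: modality-specific encoders map spherical images and point clouds into a shared embedding space, and a query from either modality is matched against the same database by nearest-neighbor search. The random-projection "encoders" and dimensions below are purely hypothetical stand-ins, not the paper's networks; they only show the cross-modal retrieval mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 32  # hypothetical shared embedding dimension

# Stand-in "encoders": one linear projection per modality into the shared space.
W_image = rng.normal(size=(128, EMBED_DIM))  # spherical-image features -> shared space
W_cloud = rng.normal(size=(256, EMBED_DIM))  # point-cloud features -> shared space

def embed(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Project features into the shared space and L2-normalize them."""
    v = features @ W
    return v / np.linalg.norm(v)

# Database: embeddings of 100 mapped places (here computed from image features).
db_feats = rng.normal(size=(100, 128))
database = np.stack([embed(f, W_image) for f in db_feats])

def retrieve(query_emb: np.ndarray, database: np.ndarray) -> int:
    """Return the index of the most similar database place (cosine similarity,
    since all embeddings are unit-normalized)."""
    return int(np.argmax(database @ query_emb))

# A query from either modality is retrieved against the same database.
image_query_idx = retrieve(embed(rng.normal(size=128), W_image), database)
cloud_query_idx = retrieve(embed(rng.normal(size=256), W_cloud), database)
```

In a trained system the two encoders would be learned jointly so that embeddings of the same place coincide across modalities; here the projections are random, so only the retrieval interface is meaningful.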