Abstract: This paper proposes a multi-map-based visual localization method for image sequences. Given multiple single-map-based localization results, we combine them with SLAM to estimate robust and accurate camera poses under challenging conditions. Our method comprises three modules connected in sequence. First, we reconstruct multiple reference maps with Structure-from-Motion, one map per reference sequence, and run a single-image localization pipeline that estimates a 6-DoF camera pose for each query image against each map. Second, a consensus set maximization module selects the best camera pose from the multi-map candidates, yielding one 6-DoF camera pose per query image. Finally, a robust pose refinement module optimizes the 6-DoF camera poses of the query images by combining map-based localization with local SLAM information. Experiments show that the proposed pipeline achieves state-of-the-art performance on challenging map-based localization benchmarks. The method also took first place in the Map-Based Localization for Autonomous Driving challenge at ECCV 2022, demonstrating its broad applicability.
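To make the second module more concrete, the following is a minimal sketch of the general idea of consensus-based pose selection for a single query image. It is not the paper's algorithm, whose details the abstract does not give. It assumes that the candidate poses from all maps are already expressed in a common world frame, that each pose is a pair of a 3x3 rotation matrix and a translation, and that agreement is judged with hypothetical thresholds `max_trans` and `max_rot_deg`. All function names are illustrative.

```python
import math


def rotation_angle_deg(R_a, R_b):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_a^T R_b) equals the sum of elementwise products of the two matrices.
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def select_pose_by_consensus(candidates, max_trans=0.5, max_rot_deg=5.0):
    """Pick the candidate pose that the most candidates agree with.

    candidates: list of (R, t) pairs, one per map (assumed common frame).
    max_trans, max_rot_deg: hypothetical agreement thresholds.
    Returns (index, consensus_size); each candidate counts itself.
    """
    best_index, best_count = -1, -1
    for i, (R_i, t_i) in enumerate(candidates):
        count = sum(
            1
            for (R_j, t_j) in candidates
            if math.dist(t_i, t_j) <= max_trans
            and rotation_angle_deg(R_i, R_j) <= max_rot_deg
        )
        # Strict comparison keeps the earliest candidate on ties.
        if count > best_count:
            best_index, best_count = i, count
    return best_index, best_count


def rot_z(deg):
    """Rotation about the z-axis, used only to build the toy example."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Toy example: three maps roughly agree, one map gives an outlier pose.
candidates = [
    (rot_z(0.0), (10.0, 2.0, 0.0)),
    (rot_z(1.0), (10.1, 2.1, 0.0)),
    (rot_z(40.0), (25.0, -3.0, 0.0)),  # outlier
    (rot_z(-1.0), (9.9, 1.9, 0.0)),
]
index, size = select_pose_by_consensus(candidates)
print(index, size)
```

In this toy setup the outlier pose from the third map is supported only by itself, so one of the three mutually consistent poses is selected. A sequence-level method could additionally use relative motion from SLAM between consecutive frames when counting agreement; that extension is not shown here.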
External IDs:dblp:conf/iros/LinLLL23