Abstract: Visual localization is a fundamental task for various applications including autonomous driving and robotics. Prior methods focus on extracting large numbers of often redundant, locally reliable features, resulting in limited efficiency and accuracy, especially in large-scale environments under challenging conditions. Instead, we propose to extract globally reliable features by implicitly embedding high-level semantics into both the detection and description processes. Specifically, our semantic-aware detector is able to detect keypoints from reliable regions (e.g., building, traffic lane) and implicitly suppress unreliable areas (e.g., sky, car) without relying on explicit semantic labels. This boosts the accuracy of keypoint matching by reducing the number of features sensitive to appearance changes and avoiding the need for additional segmentation networks at test time. Moreover, our descriptors are augmented with semantics and have stronger discriminative ability, providing more inliers at test time. In particular, experiments on the long-term, large-scale visual localization datasets Aachen Day-Night and RobotCar-Seasons demonstrate that our model outperforms previous local features and achieves accuracy competitive with advanced matchers while being about 2 and 3 times faster when using 2k and 4k keypoints, respectively. Code is available at https://github.com/feixue94/sfd2.
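The abstract states that semantics are embedded implicitly, so segmentation is needed only during training and not at test time. The sketch below illustrates one way such distillation could look; it is a minimal PyTorch-style example under assumed names (`CLASS_WEIGHTS`, `semantic_detection_loss`) and illustrative weights, not the authors' actual SFD2 implementation (see the linked repository for that).

```python
# Minimal sketch (assumptions, not the authors' code): distill per-pixel
# semantic reliability into a keypoint detector at training time so that
# no segmentation network is required at test time.
import torch
import torch.nn.functional as F

# Hypothetical reliability prior: stable structures score high,
# dynamic or textureless regions are suppressed.
CLASS_WEIGHTS = torch.tensor([
    1.0,   # 0: building
    1.0,   # 1: traffic lane
    0.8,   # 2: pole
    0.4,   # 3: vegetation
    0.1,   # 4: car
    0.0,   # 5: sky
])

def semantic_detection_loss(score_map: torch.Tensor,
                            seg_labels: torch.Tensor,
                            class_weights: torch.Tensor = CLASS_WEIGHTS) -> torch.Tensor:
    """Push detection scores toward per-pixel semantic reliability.

    score_map:  (B, 1, H, W) raw keypoint scores from the detector.
    seg_labels: (B, H, W) integer labels from an off-line segmentation
                network, used only during training.
    """
    reliability = class_weights[seg_labels]       # (B, H, W) soft targets in [0, 1]
    probs = torch.sigmoid(score_map).squeeze(1)   # (B, H, W)
    # The detector learns to fire only in reliable regions; at test time the
    # semantics are baked into its weights and no labels are needed.
    return F.binary_cross_entropy(probs, reliability)

if __name__ == "__main__":
    # Random tensors standing in for a training batch.
    scores = torch.randn(2, 1, 64, 64)
    labels = torch.randint(0, len(CLASS_WEIGHTS), (2, 64, 64))
    print(semantic_detection_loss(scores, labels).item())
```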