Abstract: Interest point detection and local feature description are fundamental steps in many
computer vision applications. Classical approaches are based on a detect-then-describe paradigm where separate handcrafted methods are used to first identify
repeatable keypoints and then represent them with a local descriptor. Neural
networks trained with metric learning losses have recently caught up with these
techniques, focusing on learning repeatable saliency maps for keypoint detection
or learning descriptors at the detected keypoint locations. In this work, we argue
that repeatable regions are not necessarily discriminative and can therefore lead
to the selection of suboptimal keypoints. Furthermore, we claim that descriptors should be
learned only in regions for which matching can be performed with high confidence.
We thus propose to jointly learn keypoint detection and description together with
a predictor of the local descriptor discriminativeness. This allows ambiguous areas to be avoided, thus leading to reliable keypoint detection and description. Our
detection-and-description approach simultaneously outputs sparse, repeatable and
reliable keypoints that outperform state-of-the-art detectors and descriptors on the
HPatches dataset and on the recent Aachen Day-Night localization benchmark.
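
To make the idea of keeping only keypoints that are both repeatable and reliable concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the network has already produced two per-pixel maps with values in [0, 1], a repeatability map and a reliability map. The function name `select_keypoints`, the parameters `top_k` and `min_reliability`, the 3x3 peak test and the product used for ranking are all hypothetical choices made for illustration.

```python
import numpy as np

def select_keypoints(repeatability, reliability, top_k=5, min_reliability=0.5):
    """Return up to top_k (row, col) peaks of repeatability in reliable areas.

    Illustrative assumption: both inputs are 2D arrays of the same shape.
    """
    h, w = repeatability.shape
    # Pad with -inf so border pixels also have a full 3x3 neighborhood.
    padded = np.pad(repeatability, 1, mode="constant", constant_values=-np.inf)
    # Maximum over the nine shifted views, i.e. over each 3x3 neighborhood.
    neighborhood_max = np.max(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    # A pixel is a peak if nothing in its neighborhood is larger
    # (ties on flat regions also count in this simple version).
    is_peak = repeatability >= neighborhood_max
    # Discard peaks in ambiguous areas, where reliability is low.
    keep = is_peak & (reliability >= min_reliability)
    ys, xs = np.nonzero(keep)
    # Rank survivors by the product of both scores (an assumed heuristic).
    scores = repeatability[ys, xs] * reliability[ys, xs]
    order = np.argsort(-scores)[:top_k]
    return [(int(ys[i]), int(xs[i])) for i in order]

# Toy maps: two repeatable peaks, only one of which lies in a reliable area.
rep = np.zeros((5, 5))
rep[1, 1] = 0.9
rep[3, 3] = 0.8
rel = np.full((5, 5), 0.1)
rel[3, 3] = 0.9
print(select_keypoints(rep, rel))
```

In the toy maps, the strongest repeatability peak falls in a low-reliability area and is discarded, which mirrors the argument that repeatable regions are not necessarily discriminative.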