Keywords: Keypoint Detection, Uncertainty Estimation, 3D Vision, Structure from Motion, Self-supervised learning
TL;DR: RaCo learns to detect repeatable keypoints, rank them meaningfully, and estimate their spatial uncertainty through self-supervision on perspective image crops.
Abstract: This paper introduces RaCo, a lightweight neural network designed to learn robust and versatile keypoints for a variety of 3D computer vision tasks. The model has three key components: a repeatable keypoint detector, a differentiable ranker to maximize matches with a limited number of keypoints, and a covariance estimator to quantify spatial uncertainty in metric scale.
RaCo is trained only on perspective image crops and does not require covisible image pairs. It achieves strong rotational robustness through extensive data augmentation, avoiding computationally expensive equivariant network architectures. Evaluated on several challenging datasets, the method demonstrates state-of-the-art performance in keypoint repeatability and two-view matching, especially under large in-plane rotations.
Ultimately, RaCo provides a simple and effective strategy to independently estimate keypoint ranking and metric covariance without additional labels. It detects interpretable and repeatable interest points, and its model and training code will be publicly released under a permissive license.
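To make the idea of supervision without covisible pairs concrete, the following is a minimal sketch, not RaCo's actual training objective or evaluation code. It assumes that a second view is produced by rotating a crop in-plane by a known angle, so the expected location of every keypoint in the rotated view is known by construction. The function names `rotate_points` and `repeatability`, the pixel threshold `eps`, and the toy coordinates are all hypothetical and chosen for illustration only.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points by angle_deg about center (an in-plane rotation)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return rotated


def repeatability(kpts_a, kpts_b, angle_deg, center, eps=3.0):
    """Fraction of view-A keypoints that, after applying the known rotation,
    land within eps pixels of some keypoint detected in view B."""
    if not kpts_a or not kpts_b:
        return 0.0
    warped = rotate_points(kpts_a, angle_deg, center)
    hits = 0
    for wx, wy in warped:
        if any(math.hypot(wx - bx, wy - by) <= eps for bx, by in kpts_b):
            hits += 1
    return hits / len(kpts_a)


# Toy example (hypothetical coordinates): view B re-detects two of the three
# keypoints from view A and adds one unrelated detection.
kpts_a = [(10.0, 20.0), (40.0, 15.0), (32.0, 50.0)]
center = (32.0, 32.0)
kpts_b = rotate_points(kpts_a[:2], 90.0, center) + [(5.0, 5.0)]
score = repeatability(kpts_a, kpts_b, 90.0, center)
```

In this toy case two of the three keypoints are re-detected, so `score` is 2/3. Because the transformation is synthetic and known, a measure of this kind needs no ground-truth correspondences between real image pairs.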
Supplementary Material: zip
Submission Number: 361