Abstract: Local feature detection and description are two essential steps in many visual applications. Most learned local feature methods require high-quality labeled data to achieve superior performance, but such labels are often expensive to obtain. To address this problem, we propose MUFeat, an unsupervised framework that jointly learns a local feature detector and descriptor without requiring ground-truth correspondences. MUFeat trains the network using putative matches from a pretrained model together with two proposed unsupervised loss functions. Furthermore, the MUFeat framework includes a pyramidal feature hierarchy network that extracts keypoints and descriptors from feature maps. Experiments indicate that MUFeat outperforms most state-of-the-art supervised learning methods on image matching, medical image registration, and visual localization tasks.