Marrying NeRF with Feature Matching for One-step Pose Estimation

Published: 19 Apr 2024, Last Modified: 13 May 2024, Venue: RoboNerF WS 2024 Poster, License: CC BY 4.0
Keywords: NeRF, One-shot Object Pose Estimation, PnP
TL;DR: We propose a fast NeRF-based pose estimation method that introduces image matching, which avoids hundreds of optimization steps and solves the pose directly in one step via PnP.
Abstract: Given an image collection of an object, we aim to build a real-time image-based pose estimation method that requires neither a CAD model nor hours of object-specific training. Recent NeRF-based methods offer a promising solution by directly optimizing the pose with a pixel loss between rendered and target images. However, at inference they take a long time to converge and suffer from local minima, making them impractical for real-time robot applications. We address this problem by marrying image matching with NeRF. With 2D matches and depth rendered by NeRF, we solve the pose directly in one step by building 2D-3D correspondences between the target and initial views, thus allowing for real-time prediction. To improve the accuracy of the 2D-3D correspondences, we propose a 3D-consistent point mining strategy, which effectively discards unfaithful points reconstructed by NeRF. In addition, current NeRF-based methods that naively optimize a pixel loss fail on occluded images, so we further propose a sampling strategy based on 2D matches to exclude the occluded area. Experimental results on representative datasets show that our method outperforms state-of-the-art methods and improves inference efficiency by 90$\times$, achieving real-time prediction at 6 FPS. This is a shortened version of a paper accepted at ICRA 2024.
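The 2D-3D correspondence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `backproject_matches`, the pinhole intrinsics `K`, the nearest-pixel depth lookup, and the assumption that the rendered depth is z-depth (distance along the optical axis rather than along the ray) are all illustrative choices. The sketch lifts matched pixels of the initial view to 3D world points using the rendered depth and the known initial pose; pairing those points with the matched pixels in the target image gives the 2D-3D correspondences that a PnP solver consumes.

```python
import numpy as np


def backproject_matches(pixels_init, depth_init, K, cam_to_world):
    """Lift matched pixels of the initial view to 3D world points.

    pixels_init : (N, 2) matched (u, v) pixel coordinates in the initial view.
    depth_init  : (H, W) depth map rendered for the initial view
                  (assumed to be z-depth; an assumption of this sketch).
    K           : (3, 3) pinhole intrinsics (assumed camera model).
    cam_to_world: (4, 4) pose of the initial view.
    """
    pixels_init = np.asarray(pixels_init, dtype=float)
    depth_init = np.asarray(depth_init, dtype=float)
    K = np.asarray(K, dtype=float)
    cam_to_world = np.asarray(cam_to_world, dtype=float)

    u, v = pixels_init[:, 0], pixels_init[:, 1]

    # Nearest-pixel depth lookup, clamped to the image bounds.
    rows = np.clip(np.rint(v).astype(int), 0, depth_init.shape[0] - 1)
    cols = np.clip(np.rint(u).astype(int), 0, depth_init.shape[1] - 1)
    z = depth_init[rows, cols]

    # Normalized camera-frame directions with z = 1, then scale by depth.
    homog = np.stack([u, v, np.ones_like(u)], axis=1)
    pts_cam = (homog @ np.linalg.inv(K).T) * z[:, None]

    # Move the points from the initial camera frame to the world frame.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t


# Toy example with made-up values: a flat depth map 2 units away.
K_demo = np.array([[100.0, 0.0, 32.0],
                   [0.0, 100.0, 24.0],
                   [0.0, 0.0, 1.0]])
pose_demo = np.eye(4)
pose_demo[:3, 3] = [0.5, 0.0, 0.0]
depth_demo = np.full((48, 64), 2.0)
matches_init = np.array([[32.0, 24.0], [82.0, 24.0]])

points_3d = backproject_matches(matches_init, depth_demo, K_demo, pose_demo)
# points_3d[i] pairs with the i-th matched pixel in the target image; these
# pairs would then go to a PnP solver, for example OpenCV's
# cv2.solvePnPRansac, to recover the target pose in one step.
```

With only 2D matches and one rendered depth map, the pose comes from a single closed-form or RANSAC-wrapped solve instead of iterative rendering, which is where the one-step speedup comes from. Filtering out unreliable 3D points before the solve, as the abstract's point mining strategy does, is not shown here.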
Submission Number: 26