Learning Geometry Consistent Neural Radiance Fields from Sparse and Unposed Views

Published: 20 Jul 2024, Last Modified: 21 Jul 2024 · MM2024 Poster · CC BY 4.0
Abstract: The recent progress in novel view synthesis is largely attributed to the Neural Radiance Field (NeRF), which requires many images with precise camera poses. However, collecting dense input images with accurate camera poses is a formidable challenge in real-world scenarios. In this paper, we propose Learning Geometry Consistent Neural Radiance Field (GC-NeRF), which tackles this challenge by jointly optimizing a NeRF and camera poses from sparse (as few as 2) and unposed images. First, GC-NeRF establishes geometric consistency at the image level, producing photometric constraints from inter- and intra-view observations that update the NeRF and camera poses in a fine-grained manner. Second, we adopt geometric projection with camera extrinsic parameters to derive region-level consistency supervision, constructing pseudo-pixel labels that capture critical matching correlations. Moreover, GC-NeRF introduces an adaptive high-frequency mapping function to enrich the geometry and texture information of the 3D scene. We evaluate the effectiveness of GC-NeRF, which sets a new state of the art in the sparse-view, jointly optimized regime on multiple challenging real-world datasets.
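The abstract does not specify the form of the adaptive high-frequency mapping; as context, the standard NeRF frequency (positional) encoding that such a mapping would presumably build on can be sketched as below. The function name, the fixed power-of-two frequency schedule, and the number of frequency bands are illustrative assumptions, not details from the paper.

```python
import numpy as np

def frequency_encode(x, num_freqs=6):
    """Map coordinates to high-frequency sinusoidal features,
    as in the standard NeRF positional encoding.

    GC-NeRF's adaptive high-frequency mapping (per the abstract)
    would modulate such frequencies adaptively; the fixed
    power-of-two schedule here is a hypothetical placeholder.

    x: array of shape (..., D) with D coordinate dimensions.
    Returns: array of shape (..., D * 2 * num_freqs).
    """
    freqs = 2.0 ** np.arange(num_freqs)          # 1, 2, 4, ..., 2^(F-1)
    xf = x[..., None] * freqs                    # (..., D, F)
    enc = np.concatenate([np.sin(xf), np.cos(xf)], axis=-1)  # (..., D, 2F)
    return enc.reshape(*x.shape[:-1], -1)        # (..., D * 2F)
```

Encoding 3D points with 6 frequency bands yields a 36-dimensional feature per point, giving the MLP access to high-frequency geometry and texture detail that raw coordinates alone cannot express.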
Primary Subject Area: [Content] Media Interpretation
Relevance To Conference: Recent advances in the Neural Radiance Field (NeRF) have shown remarkable progress in novel view synthesis, leveraging its powerful implicit scene representation. However, collecting dense input images with accurate camera poses is a formidable challenge in real-world scenarios. This paper proposes Learning Geometry Consistent Neural Radiance Field (GC-NeRF), a novel method that represents 3D scenes implicitly from sparse and unposed RGB images. Compared with strong baselines, GC-NeRF achieves state-of-the-art performance in complex real-world scenarios, offering a promising approach to novel view synthesis through the interpretation of visual media.
Supplementary Material: zip
Submission Number: 4389