360FusionNeRF: Panoramic Neural Radiance Fields with Joint Guidance

Published: 01 Jan 2023 · Last Modified: 10 Dec 2024 · IROS 2023 · CC BY-SA 4.0
Abstract: Based on neural radiance fields (NeRF), we present a pipeline for generating novel views from a single 360° panoramic image. Prior research relied on the neighborhood interpolation capability of multi-layer perceptrons to complete missing regions caused by occlusion, which resulted in artifacts in their predictions. We propose 360FusionNeRF, a semi-supervised learning framework that employs geometric supervision and semantic consistency to guide the progressive training process. First, the input image is reprojected to 360° images, and depth maps are extracted at different camera positions. In addition to the NeRF color guidance, this depth supervision improves the geometry of the synthesized views. Furthermore, we introduce a semantic consistency loss that encourages realistic renderings of novel views. We extract these semantic features using the pre-trained visual encoder of CLIP, a Vision Transformer (ViT) trained on hundreds of millions of diverse 2D photographs mined from the web with natural language supervision. Experiments indicate that our method produces realistic completions of unobserved regions while preserving the features of the scene. 360FusionNeRF consistently delivers state-of-the-art performance when transferring to the synthetic Structured3D dataset (PSNR ~5%, SSIM ~3%, LPIPS ~13%), the real-world Matterport3D dataset (PSNR ~3%, SSIM ~3%, LPIPS ~9%), and the Replica360 dataset (PSNR ~8%, SSIM ~2%, LPIPS ~18%). We provide the source code at https://github.com/MetaSLAM/360FusionNeRF.
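The joint guidance described above combines the standard NeRF photometric loss with depth supervision and a CLIP-based semantic consistency term. Below is a minimal sketch of such a combined objective, assuming PyTorch and OpenAI's `clip` package; the function names, loss forms, and weights are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a joint guidance objective (color + depth + CLIP semantic
# consistency). Names, loss forms, and weights are illustrative assumptions.
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model.float().eval()
for p in clip_model.parameters():          # CLIP acts as a frozen feature extractor
    p.requires_grad_(False)

_CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
_CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)

def clip_embed(img):
    """Embed an NCHW image batch in [0, 1] with the CLIP visual encoder."""
    img = F.interpolate(img, size=(224, 224), mode="bilinear", align_corners=False)
    img = (img - _CLIP_MEAN.to(img.device)) / _CLIP_STD.to(img.device)
    return F.normalize(clip_model.encode_image(img), dim=-1)

def joint_guidance_loss(pred_rgb, gt_rgb, pred_depth, gt_depth,
                        rendered_view, reference_view,
                        w_depth=0.1, w_sem=0.01):
    """Photometric loss + depth supervision + semantic consistency."""
    color = F.mse_loss(pred_rgb, gt_rgb)        # standard NeRF color guidance
    depth = F.l1_loss(pred_depth, gt_depth)     # geometric supervision from extracted depth maps
    # Semantic consistency: 1 - cosine similarity between CLIP embeddings of the
    # rendered novel view and a reference view of the scene.
    semantic = (1.0 - (clip_embed(rendered_view) * clip_embed(reference_view)).sum(-1)).mean()
    return color + w_depth * depth + w_sem * semantic
```

In practice the depth and semantic weights would be tuned, and the semantic term is typically evaluated on full rendered views rather than the sampled rays used for the color and depth terms.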