Refined Tensorial Radiance Field: Harnessing coordinate based networks for novel view synthesis from sparse inputs

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: neural radiance field, multi-plane encoding, coordinate-based network, sparse-inputs, few-shots, regularization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: With the help of coordinate-based networks, which are known for their strong low-frequency bias, multi-plane encoding reliably constructs both static and dynamic NeRFs from sparse inputs.
Abstract: The multi-plane encoding approach has been highlighted for its ability to serve as both static and dynamic neural radiance fields without sacrificing generality. This approach constructs features by projecting points onto learnable planes and interpolating between adjacent vertices. This mechanism allows the model to learn fine-grained details rapidly and achieve outstanding performance. However, it is limited in representing the global context of the scene, such as object shapes and dynamic motion over time, when the available training poses are sparse. In this work, we propose refined tensorial radiance fields that harness coordinate-based networks, which are known for their strong bias toward low-frequency signals. The coordinate-based network is responsible for capturing global context, while the multi-plane network focuses on capturing fine-grained details. We demonstrate that using residual connections effectively preserves their inherent properties. Additionally, the proposed curriculum training scheme accelerates the disentanglement of these two features. Empirically, the proposed method achieves results comparable to multi-plane encoding with high denoising penalties on static NeRFs, and it outperforms other methods on dynamic NeRFs with sparse inputs. In particular, we show that although excessively increasing denoising regularization for multi-plane encoding effectively eliminates artifacts, it can introduce artificial details that appear authentic but are not present in the data. The proposed method does not suffer from this issue.
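As a rough illustration of the mechanism described in the abstract, the sketch below projects a 3D point onto three axis-aligned feature planes, bilinearly interpolates the four adjacent vertices on each plane, and concatenates the result with the raw coordinates that a coordinate-based network would consume. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice of three planes, concatenation as the fusion rule, and the `detail_weight` parameter (a hypothetical stand-in for a curriculum that gradually admits plane features) are all illustrative. The abstract does not specify the actual residual connections or curriculum schedule.

```python
import numpy as np


def bilinear_sample(plane, uv):
    """Bilinearly interpolate an (H, W, C) feature grid at uv in [0, 1]^2.

    Assumes H >= 2 and W >= 2.
    """
    h, w, _ = plane.shape
    # Map normalized coordinates to continuous grid indices.
    x = uv[0] * (h - 1)
    y = uv[1] * (w - 1)
    # Lower-left vertex of the enclosing cell, clamped to stay in bounds.
    x0 = int(np.clip(np.floor(x), 0, h - 2))
    y0 = int(np.clip(np.floor(y), 0, w - 2))
    tx, ty = x - x0, y - y0
    # Blend the four adjacent vertices.
    near = (1 - ty) * plane[x0, y0] + ty * plane[x0, y0 + 1]
    far = (1 - ty) * plane[x0 + 1, y0] + ty * plane[x0 + 1, y0 + 1]
    return (1 - tx) * near + tx * far


def encode_point(planes, p, detail_weight=1.0):
    """Concatenate raw coordinates with weighted multi-plane features.

    planes: three (H, W, C) grids for the XY, XZ, and YZ projections.
    p: array of shape (3,) with entries in [0, 1].
    detail_weight: hypothetical curriculum weight; 0 leaves only the
        coordinate input, 1 passes the plane features through unchanged.
    """
    axes = [(0, 1), (0, 2), (1, 2)]
    feats = [bilinear_sample(pl, p[list(ax)]) for pl, ax in zip(planes, axes)]
    return np.concatenate([p, detail_weight * np.concatenate(feats)])


# Usage: three random 8x8 planes with 4 channels each (stand-ins for
# learnable parameters).
rng = np.random.default_rng(0)
planes = [rng.normal(size=(8, 8, 4)) for _ in range(3)]
point = np.array([0.3, 0.5, 0.9])
encoded = encode_point(planes, point, detail_weight=0.5)
```

In this sketch the coordinate part of the output is always present, so a downstream network could fit a smooth, low-frequency estimate of the scene even while `detail_weight` is small; raising the weight then lets the plane features contribute fine detail.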
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7703