UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

Published: 09 Apr 2024, Last Modified: 09 Apr 2024 · SynData4CV · CC BY 4.0
Keywords: Inverse rendering, neural radiance field, outdoor scene
TL;DR: UrbanIR decomposes scene properties to perform view synthesis, relighting, and object insertion from a single video captured under a single illumination condition.
Abstract: We present UrbanIR (Urban Scene Inverse Rendering), a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions from a single video. It accurately infers shape, albedo, visibility, and sun and sky illumination from wide-baseline videos, such as those from car-mounted cameras, unlike the dense-view settings NeRF typically assumes. In this setting, standard methods often yield subpar geometry and material estimates, such as inaccurate roof representations and numerous "floaters". UrbanIR addresses these issues with novel losses that reduce errors in inverse graphics inference and rendering artifacts. Its techniques enable precise shadow volume estimation in the original scene. The model's outputs support controllable editing, enabling photorealistic free-viewpoint renderings of night simulations, relit scenes, and inserted objects, marking a significant improvement over existing state-of-the-art methods. Our code and data will be made publicly available upon acceptance.
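A minimal sketch of the kind of sun-and-sky shading decomposition the abstract describes: once albedo, normals, and shadow visibility are inferred, relighting amounts to recombining them under new illumination. All function and parameter names here are illustrative assumptions, not UrbanIR's actual API.

```python
import numpy as np

def shade(albedo, normal, sun_dir, sun_visibility, sun_radiance, sky_radiance):
    """Relight a surface point under a simplified sun + sky illumination model.

    albedo:         (3,) RGB reflectance in [0, 1]
    normal:         (3,) unit surface normal
    sun_dir:        (3,) unit vector pointing toward the sun
    sun_visibility: scalar in [0, 1]; 0 means fully shadowed
    sun_radiance:   (3,) RGB sun intensity
    sky_radiance:   (3,) RGB ambient sky intensity
    """
    # Lambertian cosine falloff for the direct sun term
    cos_term = max(0.0, float(np.dot(normal, sun_dir)))
    # Direct sunlight is gated by the estimated shadow visibility
    direct = sun_visibility * cos_term * np.asarray(sun_radiance, dtype=float)
    # Sky light is treated as a constant ambient term in this sketch
    ambient = np.asarray(sky_radiance, dtype=float)
    return np.asarray(albedo, dtype=float) * (direct + ambient)

# Relighting (e.g. a night simulation) then means swapping sun_dir and the
# radiances while keeping the inferred albedo, normals, and visibility fixed.
lit = shade([0.5, 0.5, 0.5], [0, 0, 1], [0, 0, 1], 1.0, [1, 1, 1], [0.1, 0.1, 0.1])
shadowed = shade([0.5, 0.5, 0.5], [0, 0, 1], [0, 0, 1], 0.0, [1, 1, 1], [0.1, 0.1, 0.1])
```

This separation is what makes shadow estimation critical: an incorrect visibility term bakes shadows into the albedo, and they persist under any new lighting.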
Supplementary Material: pdf
Submission Number: 33