Keywords: Multi-view Consistent 3D Editing, Text-based 3D Editing, Gaussian Splatting, Diffusion Model
TL;DR: This paper proposes the Novel View Editing Adaptor, which enables 3D editing systems to maintain consistent editing effects even in unseen views.
Abstract: 3D editing aims to transform a given 3D structure according to the user's intent. Multi-view consistent 3D editing has been proposed to ensure consistent editing effects across different views of a 3D model, resulting in high-quality 3D structures. However, such consistency is only observed from viewpoints near the trained reference images, while renderings from other viewpoints (i.e., unseen views) often appear inconsistent and blurry. This is because current 3D editing systems are heavily optimized only for the given reference viewpoints. In real-world scenarios, it is also challenging for humans to manually identify all the viewpoints needed to ensure consistent 3D quality. To this end, we propose the Novel View Editing Adaptor (NVE-Adaptor), which enables 3D editing systems to maintain consistent quality even in unseen views. NVE-Adaptor supplements the limited reference views by sensibly exploring novel views in 3D space and rendering images from those views. These images are then refined using diffusion-based editing and used as additional supervision to improve view consistency. Our concept is simple, model-agnostic, and broadly applicable to multi-view 3D editing systems. We demonstrate its effectiveness on two 3D scene benchmarks (Mip-NeRF 360, Instruct-NeRF2NeRF) as well as on real-world data. The code will be made publicly available.
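The pipeline described in the abstract (sample novel views, render them, refine the renders with diffusion-based editing, then use them as extra supervision) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the interpolation-with-jitter view sampling, and the stub renderer/editor are all assumptions standing in for the actual components.

```python
# Illustrative sketch of the NVE-Adaptor supervision loop (assumptions:
# all function names and the view-sampling strategy are hypothetical).
import random


def sample_novel_views(reference_views, n, seed=0):
    """Sample extra camera poses by interpolating between random pairs of
    reference poses (a stand-in for the paper's novel-view exploration)."""
    rng = random.Random(seed)
    novel = []
    for _ in range(n):
        a, b = rng.sample(reference_views, 2)
        t = rng.random()
        novel.append(tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b)))
    return novel


def render(scene, view):
    """Placeholder renderer: returns a stand-in 'image' for the given view."""
    return {"view": view, "pixels": scene["content"]}


def diffusion_refine(image, prompt):
    """Placeholder for diffusion-based editing of a rendered image."""
    return {**image, "pixels": f"{image['pixels']} edited:{prompt}"}


def build_supervision(scene, reference_views, prompt, n_novel=4):
    """Collect refined renders from reference AND novel views; these serve
    as the additional supervision targets for the 3D editing system."""
    views = list(reference_views) + sample_novel_views(reference_views, n_novel)
    return [diffusion_refine(render(scene, v), prompt) for v in views]
```

The key design point mirrored here is that supervision is no longer limited to the user-provided reference views: the edited 3D representation is also optimized against diffusion-refined renders from sampled novel viewpoints, which is what enforces consistency in unseen views.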
Supplementary Material: zip
Primary Area: generative models
Submission Number: 24113