OGGSplat: Open-Vocabulary Gaussian Growing for Expanded Field-of-View

18 Sept 2025 (modified: 13 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Open Vocabulary, 3D Gaussian Splatting, Diffusion Inpainting
Abstract: Reconstructing open-vocabulary 3D scenes from sparse views is both challenging and crucial, driven by the demands of emerging applications such as virtual reality and embodied AI. However, existing generalizable open-vocabulary 3D Gaussian Splatting methods struggle to reconstruct semantically enriched regions outside the input view cone. To address this limitation, we introduce OGGSplat, an open-vocabulary Gaussian growing method that extends the field-of-view for generalizable, semantically enriched 3D scene reconstruction. Our key insight is that the semantic attributes of open-vocabulary Gaussians serve as strong priors for image extrapolation, ensuring both semantic consistency and visual plausibility. Specifically, once Gaussians with semantic attributes are initialized from sparse views, we apply an RGB-semantic consistent inpainting module to selected rendered views. This module enables bidirectional control between an image diffusion model and a semantic diffusion model. The inpainted regions, together with their semantics, are then lifted back into 3D space for efficient, progressive optimization of Gaussian parameters. To evaluate our method, we propose the Open-Vocabulary Gaussian Outpainting (OVGO) benchmark, which measures both the semantic and generative quality of the reconstructed open-vocabulary scenes. OGGSplat also demonstrates promising semantic-aware reconstruction capabilities when provided with two views captured directly from a smartphone camera.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 11821