SceneWeaver: Text-Driven Scene Generation with Geometry-aware Gaussian Splatting

Published: 05 Sept 2024, Last Modified: 16 Oct 2024, ACML 2024 Conference Track, CC BY 4.0
Keywords: Scene Generation, Gaussian Splatting, Generative Models
TL;DR: We present SceneWeaver, a text-driven geometry-aware progressive scene generation framework to generate high-quality 3D scenes.
Abstract: With the widespread use of virtual reality applications, 3D scene generation has become a challenging new research frontier. 3D scenes have highly complex structures, so it is crucial that the output is dense, coherent, and includes all necessary structures. Many current 3D scene generation methods rely on pre-trained text-to-image diffusion models and monocular depth estimators, but they often lack rich geometric constraints on the scene, which leads to geometric distortion in the generated results. We therefore propose SceneWeaver, a two-stage, geometry-aware progressive scene generation framework that creates diverse, high-quality 3D scenes from text or image inputs. In the first stage, we introduce a multi-level depth refinement mechanism, combined with image inpainting and point cloud updating strategies, to construct a high-quality initial point cloud. In the second stage, 3D Gaussians are initialized from the point cloud and continuously optimized. To address the insufficient geometric constraints in the Gaussian Splatting optimization process, we use the rich appearance and geometry information within the scene to perform a geometry-aware optimization, which yields high-quality generated scenes. Comprehensive experiments across multiple scenes demonstrate the potential and advantages of our framework compared with several baselines.
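The first stage builds a point cloud from generated images and estimated depth. As a rough illustration of that kind of step only, the sketch below lifts a depth map to 3D points with a standard pinhole camera model. It is not the authors' implementation: the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy depth values are all assumptions made for this example, and the paper's multi-level depth refinement, inpainting, and point cloud updating are not modeled.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates.

    Hypothetical helper: assumes a pinhole camera with focal lengths (fx, fy)
    and principal point (cx, cy), all expressed in pixels.
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    # Flatten row by row so that point i corresponds to pixel (i // w, i % w).
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy input: a flat wall 2 units in front of the camera (assumed values).
depth = np.full((4, 6), 2.0)
points = backproject_depth(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.0)
```

Points of this form could then serve as initial centers for 3D Gaussians, whose remaining attributes (scale, rotation, opacity, color) would be set and optimized separately.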
Primary Area: Applications (bioinformatics, biomedical informatics, climate science, collaborative filtering, computer vision, healthcare, human activity recognition, information retrieval, natural language processing, social networks, etc.)
Student Author: Yes
Submission Number: 363