Abstract: 3D content creation has long been a complex and time-consuming process, often requiring specialized skills and resources. While recent advances have enabled text-guided 3D object and scene generation, they still provide insufficient control over the generation process, leaving a gap between the user's creative vision and the generated results. In this paper, we present iControl3D, a novel interactive system that lets users generate and render customizable 3D scenes with precise control. To this end, we develop a 3D creator interface that gives users fine-grained control over the creation process. Technically, we use 3D meshes as an intermediary proxy to iteratively merge individual 2D diffusion-generated images into a cohesive, unified 3D scene representation. To integrate the meshes seamlessly, we perform boundary-aware depth alignment before fusing each newly generated mesh with the existing one in 3D space. Additionally, to manage depth discrepancies between remote content and the foreground, we model remote content separately with an environment map instead of 3D meshes. Finally, our neural rendering interface enables users to build a radiance field of their scene online and navigate the entire scene. Extensive experiments demonstrate the effectiveness of our system.
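The abstract does not spell out how boundary-aware depth alignment is computed. As a rough illustration only, one common way to align a newly predicted depth map with existing geometry is to fit a scale and shift by weighted least squares over the overlapping pixels, giving more weight to pixels near the seam. The sketch below assumes that formulation; the function names `boundary_weights` and `align_depth`, the exponential weighting, and the `sigma` parameter are hypothetical and are not taken from iControl3D.

```python
import math


def boundary_weights(dist_to_boundary, sigma=2.0):
    """Hypothetical weighting: pixels closer to the seam count more.

    dist_to_boundary: per-pixel distance (in pixels) to the mesh boundary.
    """
    return [math.exp(-d / sigma) for d in dist_to_boundary]


def align_depth(new_depth, ref_depth, weights):
    """Fit scale s and shift t minimizing sum_i w_i * (s * new_i + t - ref_i)^2.

    new_depth: depths predicted for the new view, on the overlap region.
    ref_depth: depths rendered from the existing mesh at the same pixels.
    Returns (s, t); apply as s * d + t to the whole new depth map.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("need at least one pixel with positive weight")
    mean_new = sum(w * x for w, x in zip(weights, new_depth)) / total
    mean_ref = sum(w * y for w, y in zip(weights, ref_depth)) / total
    var = sum(w * (x - mean_new) ** 2 for w, x in zip(weights, new_depth))
    cov = sum(
        w * (x - mean_new) * (y - mean_ref)
        for w, x, y in zip(weights, new_depth, ref_depth)
    )
    if var == 0:
        # Constant depth on the overlap: only a shift can be estimated.
        return 1.0, mean_ref - mean_new
    s = cov / var
    return s, mean_ref - s * mean_new


# Toy overlap region: the new depths differ from the reference by an affine map.
new = [1.0, 2.0, 3.0, 4.0]
ref = [2.0 * d + 0.5 for d in new]
w = boundary_weights([0.0, 1.0, 2.0, 3.0])
scale, shift = align_depth(new, ref, w)
aligned = [scale * d + shift for d in new]
```

When the two depth maps differ by an exact scale and shift on the overlap, the fit recovers that transform regardless of the weights; the weights only matter when the maps disagree, in which case they bias the fit toward agreement near the seam.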
Relevance To Conference: Our work offers control and flexibility in generating customizable 3D scenes. Existing methods, which typically accept only text prompts, often lack precision, leaving a gap between the user's vision and the generated results. Our proposed iControl3D addresses this with a user-friendly interface that combines text prompts with additional input conditions such as user scribbles, semantic segmentation maps, depth, and other relevant information, giving users fine-grained control throughout the creation process. This facilitates the creation of intricate and detailed 3D scenes, enhancing the overall complexity and fidelity of the output.
Supplementary Material: zip
Primary Subject Area: [Generation] Generative Multimedia
Submission Number: 90