Cons2Plan: Vector Floorplan Generation from Various Conditions via a Learning Framework based on Conditional Diffusion Models

Shibo Hong; Xuhong Zhang; Tianyu Du; Sheng Cheng; Xun Wang; Jianwei Yin

Published: 20 Jul 2024, Last Modified: 21 Jul 2024. Venue: MM2024 Poster. License: CC BY 4.0.

Abstract: Floorplan generation has attracted significant interest from the community, and recent methods based on generative models have substantially advanced it. However, generating floorplans that satisfy various conditions remains a challenging task. This paper proposes a learning framework, named Cons2Plan, for automatically generating high-quality vector floorplans from various conditions. The input conditions can be graphs, boundaries, or a combination of both. A conditional diffusion model is the core component of Cons2Plan. Its denoising network uses a conditional embedding module to incorporate the conditions as guidance during the reverse process. Additionally, Cons2Plan incorporates a two-stage approach that generates graph conditions from boundaries: it uses three regression models for node prediction and a novel conditional edge generation diffusion model, named CEDM, for edge generation. We conduct qualitative evaluations, quantitative comparisons, and ablation studies to demonstrate that our method produces higher-quality floorplans than state-of-the-art methods.
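To make the idea of condition-guided denoising concrete, the following is a minimal sketch of a standard DDPM-style reverse process in which every denoising step receives an embedding of the input conditions. It is not the Cons2Plan implementation: the names `embed_condition`, `toy_denoiser`, `reverse_step`, and `sample` are hypothetical, the linear noise schedule is an assumed common default, and the "denoiser" is an untrained placeholder standing in for a learned noise-prediction network.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def embed_condition(boundary, adjacency):
    """Hypothetical conditional embedding: flatten the boundary polygon
    and the room-adjacency matrix into a single vector."""
    coords = [float(c) for point in boundary for c in point]
    edge_flags = [float(v) for row in adjacency for v in row]
    return coords + edge_flags

def toy_denoiser(x_t, t, cond):
    """Placeholder for a trained network eps_theta(x_t, t, c).
    It only mixes its inputs so the sampling loop can run end to end."""
    bias = sum(cond) / max(len(cond), 1)
    return [0.1 * (xi - bias) for xi in x_t]

def reverse_step(x_t, t, cond, betas, alpha_bars, rng):
    """One DDPM reverse step x_t -> x_{t-1}, guided by the condition embedding."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    eps = toy_denoiser(x_t, t, cond)
    coef = beta_t / math.sqrt(1.0 - alpha_bars[t])
    mean = [(xi - coef * ei) / math.sqrt(alpha_t) for xi, ei in zip(x_t, eps)]
    if t == 0:
        return mean  # no noise is added at the final step
    sigma = math.sqrt(beta_t)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]

def sample(num_coords, cond, num_steps=50, seed=0):
    """Start from Gaussian noise and denoise step by step under the condition."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alpha_bars, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        alpha_bars.append(running)
    x = [rng.gauss(0.0, 1.0) for _ in range(num_coords)]
    for t in reversed(range(num_steps)):
        x = reverse_step(x, t, cond, betas, alpha_bars, rng)
    return x

# Usage: a unit-square boundary and a two-room adjacency graph as conditions.
boundary = [(0, 0), (1, 0), (1, 1), (0, 1)]
adjacency = [[0, 1], [1, 0]]
cond = embed_condition(boundary, adjacency)
corners = sample(num_coords=8, cond=cond, num_steps=10, seed=1)
```

In this sketch the condition enters only through the noise prediction, which mirrors the abstract's description of conditions acting as guidance during the reverse process. A real system would replace the placeholder with a trained network and decode the sampled vector into room polygons.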

Primary Subject Area: [Generation] Generative Multimedia

Relevance To Conference: This work, through the development of Cons2Plan, contributes significantly to multimedia/multimodal processing by introducing a versatile framework capable of generating vector floorplans from diverse inputs such as graphs and boundaries. At the heart of Cons2Plan is a conditional diffusion model, which integrates various conditions into the floorplan generation process through a conditional embedding module. This integration allows the system to process and interpret multimodal inputs, adapting its output to satisfy complex, multifaceted conditions. Moreover, the two-stage approach for generating graph conditions from boundaries showcases an advanced capability in processing and synthesizing information across different modalities. By employing regression models and a conditional edge generation diffusion model, Cons2Plan can autonomously create logical and coherent spatial layouts, even in the absence of explicit graph-based inputs from users. This autonomy in generating diverse and reasonable graphs underscores the framework's advanced multimodal processing capabilities. The conducted qualitative evaluations, quantitative comparisons, and ablation studies further affirm that Cons2Plan not only enhances the quality of floorplan generation but also pushes the boundaries of multimedia and multimodal processing by handling and integrating heterogeneous data types to produce coherent and functional outputs.

Submission Number: 1484
