Abstract: Text-conditioned generative models for images have yielded impressive results. Text-conditioned floorplan generation, a special type of raster image generation task, has also received particular attention.
However, there are many use cases in floorplan generation where numerical properties of the generated result are more important than aesthetics. For instance, one might want to specify sizes for certain rooms in a floorplan and compare the generated floorplan against those specifications. Current approaches, datasets, and commonly used evaluations do not support these kinds of constraints. An attractive strategy is therefore to generate an intermediate data structure that contains the numerical properties of a floorplan and can be used to produce the final floorplan image (a hypothetical sketch of such a structure follows the abstract). To explore this setting, we (1) construct a new dataset for this data-structure-to-data-structure formulation of floorplan generation using two popular image-based floorplan datasets, RPLAN and ProcTHOR-10k, and provide tools to convert further procedurally generated ProcTHOR floorplan data into our format.
(2) We explore the task of floorplan generation given a partial or complete set of constraints, and we design a series of metrics and benchmarks to evaluate how well model-generated samples respect those constraints.
(3) We create multiple baselines by finetuning a large language model (LLM), Llama3, and demonstrate the feasibility of using floorplan-data-structure-conditioned LLMs for floorplan generation that respects numerical constraints.
We hope that our language-based approach to this image-based design problem, together with our newly developed benchmarks, will encourage further research on improving LLMs and other generative modelling techniques for generating designs where quantitative constraints are only partially specified but must be respected.
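To make the setting concrete, below is a minimal, hypothetical sketch of what a partial constraint specification and an intermediate floorplan data structure might look like, together with a simple constraint-satisfaction check. The field names, units, and tolerance are illustrative assumptions, not the paper's actual schema or evaluation metric.

```python
# Hypothetical example only: field names, units (m^2), and the tolerance below
# are illustrative assumptions, not the paper's actual intermediate format.

# A partial constraint specification: only some rooms have target areas.
constraints = {
    "bedroom_1": {"area": 12.0},
    "kitchen": {"area": 9.0},
    # living_room is left unconstrained
}

# A generated floorplan expressed as an intermediate, structured record.
generated_floorplan = {
    "rooms": [
        {"name": "bedroom_1", "area": 11.6, "bbox": [0.0, 0.0, 3.4, 3.4]},
        {"name": "kitchen", "area": 9.3, "bbox": [3.4, 0.0, 6.5, 3.0]},
        {"name": "living_room", "area": 20.1, "bbox": [0.0, 3.4, 6.5, 6.5]},
    ]
}

def constraint_satisfaction_rate(constraints, floorplan, tol=0.10):
    """Fraction of specified area constraints met within a relative tolerance
    (an illustrative metric, not necessarily the paper's exact definition)."""
    areas = {room["name"]: room["area"] for room in floorplan["rooms"]}
    satisfied = 0
    for name, spec in constraints.items():
        target = spec["area"]
        actual = areas.get(name)
        if actual is not None and abs(actual - target) / target <= tol:
            satisfied += 1
    return satisfied / len(constraints) if constraints else 1.0

print(constraint_satisfaction_rate(constraints, generated_floorplan))  # 1.0
```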
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: 1. Fixed typos and revised writing
2. Added LLM usage examples in Appendix J
3. Fixed language in Abstract
4. Expanded analysis sections on issues raised by the reviewers
5. Added natural language prompt results to the Results section
6. Added visual comparison with prior methods in Appendix K
7. Added comments on the LLM's difficulty with understanding geometry to the result analysis
Assigned Action Editor: ~Chunyuan_Li1
Submission Number: 3727