DualFlow: Dual Diffusion Branches for Cross-modal Information Flow in Graphic Design Template Generation
Keywords: Layout Generation, Graphic Design
Abstract: In this paper, we aim to automate design template creation: given an input text, generate a background image and a layout of foreground elements over that background so that the two form a harmonious composition. Prior works on design template generation generate the background and layout sequentially using two separate models, and model the dependency between them only by conditioning one model on the final output of the other. Hence, these methods fall short of capturing the intricate interaction between the background and the layout.
To overcome this limitation, we propose a diffusion model, DualFlow, which jointly generates background images and layouts in a single generative process. The novel design of our joint model's denoising network connects the backbones of pre-trained image and layout diffusion models with a carefully designed, learnable communication module. At training time, the image and layout backbones are frozen to maintain the pre-trained priors, while the communication module is trained from scratch to focus on learning subtle image-layout interaction to generate more harmonious compositions.
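The frozen-backbone arrangement described above can be illustrated with a toy sketch. This is not the authors' implementation: the class names `Branch` and `Communication`, the scalar "weights", the mean-based feature exchange, and the zero initialization of the communication gains are all assumptions made purely for illustration. It only shows the general pattern of two frozen branches whose features are exchanged by a single trainable module within one joint denoising step.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical stand-in for a pre-trained denoising backbone."""
    weight: float
    trainable: bool = False  # frozen, to keep the pre-trained prior

    def features(self, x):
        # Toy "backbone": scale every input value by a fixed weight.
        return [self.weight * v for v in x]


@dataclass
class Communication:
    """Hypothetical learnable module that exchanges branch features."""
    img_to_lay: float = 0.0  # assumed zero init: branches start independent
    lay_to_img: float = 0.0
    trainable: bool = True   # the only part trained from scratch

    def __call__(self, img_feat, lay_feat):
        img_mean = sum(img_feat) / len(img_feat)
        lay_mean = sum(lay_feat) / len(lay_feat)
        # Each branch receives a summary of the other branch's features.
        new_img = [f + self.lay_to_img * lay_mean for f in img_feat]
        new_lay = [f + self.img_to_lay * img_mean for f in lay_feat]
        return new_img, new_lay


def joint_denoise_step(image_branch, layout_branch, comm, noisy_image, noisy_layout):
    """One joint step: both modalities are processed in the same pass."""
    img_feat = image_branch.features(noisy_image)
    lay_feat = layout_branch.features(noisy_layout)
    return comm(img_feat, lay_feat)


def trainable_modules(**modules):
    """Names of the modules that an optimizer would be allowed to update."""
    return [name for name, m in modules.items() if m.trainable]
```

In this sketch, `trainable_modules(image=..., layout=..., comm=...)` would report only the communication module, mirroring the training setup in which the image and layout backbones stay frozen.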
Furthermore, we introduce two metrics, TemplateFID and TemplateCLIP, to assess the quality of generated design templates holistically.
Our experiments show that, compared with prior approaches, our model achieves significantly better results and produces outputs that are closer to real samples. We also demonstrate the flexibility of our model in enforcing additional design principles at inference time without retraining.
Primary Area: generative models
Submission Number: 7101