Efficient Low-Rank Diffusion Model Training for Text-to-Image Generation

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Efficiency, Diffusion
TL;DR: We propose an end-to-end efficient training paradigm for controllable text-to-image generation.
Abstract: Recent advances in text-to-image generation have been driven by large-scale diffusion-based generative models. However, exerting control over these models, particularly for structure-conditioned text-to-image generation, remains an open challenge. One straightforward way to achieve control is fine-tuning, which often comes at the cost of efficiency. In this work, we address this challenge by introducing ELR-Diffusion (Efficient Low-rank Diffusion), a method tailored for efficient structure-conditioned image generation. Our approach leverages a low-rank decomposition of the model weights, reducing memory cost and model parameters by up to 58% while performing comparably to larger models trained on more extensive datasets with greater computational resources. At the heart of ELR-Diffusion lies a two-stage training scheme that combines low-rank decomposition with knowledge distillation. To assess the model robustly, we conduct a thorough comparative analysis in the controllable text-to-image generation domain, evaluating under a variety of conditions, including edge maps and segmentation maps, with a diverse array of metrics, including image quality measures, to offer a holistic view of the model's capabilities. We believe ELR-Diffusion can serve as an efficient foundation model for diverse user applications that demand accurate comprehension of inputs containing multiple forms of conditioning information.
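To make the parameter-reduction claim concrete, the sketch below illustrates the general idea of low-rank weight decomposition via truncated SVD, replacing a dense linear layer with two skinny factors. This is a minimal, hypothetical illustration of the technique named in the abstract, not the authors' exact ELR-Diffusion scheme or training pipeline; the class name `LowRankLinear` and the chosen rank are assumptions for demonstration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankLinear(nn.Module):
    """Linear layer whose weight is stored as a rank-r factorization W ~ U @ V.

    Hypothetical sketch: the truncated SVD of the pretrained weight gives the
    best rank-r approximation in the Frobenius norm, so parameters drop from
    d_out * d_in to r * (d_out + d_in).
    """
    def __init__(self, linear: nn.Linear, rank: int):
        super().__init__()
        # SVD of the pretrained weight: W = U diag(S) V^T.
        U, S, Vh = torch.linalg.svd(linear.weight.detach(), full_matrices=False)
        # Fold the top-r singular values into the left factor.
        self.U = nn.Parameter(U[:, :rank] * S[:rank])  # shape (d_out, r)
        self.V = nn.Parameter(Vh[:rank, :])            # shape (r, d_in)
        self.bias = (nn.Parameter(linear.bias.detach().clone())
                     if linear.bias is not None else None)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Two skinny matmuls instead of one dense one: (x @ V^T) @ U^T + b.
        return F.linear(F.linear(x, self.V), self.U, self.bias)

# Example: a 1024x1024 projection at rank 128 keeps ~25% of the parameters.
full = nn.Linear(1024, 1024)
low = LowRankLinear(full, rank=128)
print(sum(p.numel() for p in full.parameters()))  # 1,049,600
print(sum(p.numel() for p in low.parameters()))   # 263,168
```

In a diffusion U-Net, factorizations like this would typically be applied to the large attention and projection layers, which is consistent with (though not confirmed by) the abstract's reported reduction of up to 58% in parameters and memory.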
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2075