AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation

16 Sept 2025 (modified: 14 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: LoRA Retrieval, LoRA Fusion, Diffusion Model
Abstract: Despite remarkable progress in photorealistic image generation with large-scale diffusion models such as FLUX and Stable Diffusion v3, the fragmented ecosystem of community-developed LoRA adapters and the difficulty of systematically integrating them into foundation models hinder their practical deployment. Widespread adoption of these adapters faces three pressing challenges: sparse metadata annotation, the requirement for zero-shot adaptation, and suboptimal strategies for multi-LoRA fusion. To address these challenges, we propose a framework that unifies community-developed LoRA adapters through semantic retrieval and dynamic fusion, effectively functioning as an ecosystem integrator. The framework consists of two key components: (1) a weight encoding-based retriever that aligns LoRA parameter matrices with text prompts in a shared semantic space, thereby eliminating the need for the original training data, and (2) a fine-grained gated fusion mechanism that computes context-specific fusion weights across network layers and diffusion timesteps, enabling effective integration of multiple LoRA modules during generation. Experiments demonstrate that our approach outperforms strong baselines, improving aesthetic scores by up to 5% and maintaining fidelity when fusing up to 3 LoRAs, where prior methods fail. This establishes a practical bridge between the community-driven proliferation of LoRA modules and the deployment requirements of large-scale diffusion systems in a scalable and data-efficient way.
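The two components described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the weight encoder or the gate network, so the sketch assumes that each LoRA already has an embedding in the same space as the prompt embedding, ranks LoRAs by cosine similarity, and uses a hypothetical table of gate logits indexed by (layer, timestep) with a softmax to obtain fusion weights. All function names, shapes, and the softmax gate are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def retrieve_top_k(prompt_emb, lora_embs, k):
    """Rank LoRAs by cosine similarity to the prompt (assumed shared space).

    prompt_emb: (d,) prompt embedding.
    lora_embs:  (n, d) one embedding per LoRA, assumed to come from some
                weight encoder that is not modeled here.
    Returns the indices of the k most similar LoRAs and their scores.
    """
    sims = l2_normalize(lora_embs) @ l2_normalize(prompt_emb)
    order = np.argsort(-sims)[:k]
    return order, sims[order]


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def gated_fusion(x, W0, loras, gate_logits):
    """Apply a base linear layer plus a gated sum of LoRA updates.

    Computes y = W0 x + sum_i g_i * B_i (A_i x), with g = softmax(gate_logits).
    loras is a list of (A_i, B_i) pairs with A_i: (r, d_in), B_i: (d_out, r).
    The softmax gate is an assumption, not taken from the paper.
    """
    g = softmax(gate_logits)
    y = W0 @ x
    for g_i, (A, B) in zip(g, loras):
        y = y + g_i * (B @ (A @ x))
    return y, g


# Toy usage with random data (all sizes are arbitrary illustrative choices).
rng = np.random.default_rng(0)
d_emb, d_model, rank, n_loras, k = 16, 8, 2, 5, 3
num_layers, num_steps = 4, 10

prompt_emb = rng.normal(size=d_emb)
lora_embs = rng.normal(size=(n_loras, d_emb))
top_idx, top_scores = retrieve_top_k(prompt_emb, lora_embs, k)

# One (A, B) pair per retrieved LoRA for a single layer.
loras = [(rng.normal(size=(rank, d_model)), rng.normal(size=(d_model, rank)))
         for _ in top_idx]

# Hypothetical gate table: one logit per (layer, timestep, retrieved LoRA).
gate_table = rng.normal(size=(num_layers, num_steps, k))

x = rng.normal(size=d_model)
W0 = rng.normal(size=(d_model, d_model))
y, gates = gated_fusion(x, W0, loras, gate_table[2, 7])
```

Because the gate is looked up per layer and per timestep, the same set of retrieved LoRAs can contribute with different strengths at different depths of the network and at different stages of denoising, which is the behavior the abstract attributes to fine-grained gated fusion.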
Primary Area: generative models
Submission Number: 7058