Abstract: Diffusion models have achieved remarkable progress in text-to-image generation, enabling the rise of personalized models. A key challenge in personalized generation is to provide users with precise control while ensuring high fidelity to the reference content. To address this, we introduce VariGen, a framework that empowers users to achieve fine-grained, layout-controllable personalized image generation. VariGen employs a Variational Detail-Aware Feature Extractor to capture intricate details from reference subjects and a Dual Layout Control Mechanism to integrate layout specifications seamlessly into the generation process. Extensive experiments demonstrate that VariGen achieves superior performance, offering unparalleled creative freedom and fidelity. To our knowledge, this is the first work to enable users to “create anything, anywhere” with such precision and flexibility.
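To make the two components named in the abstract more concrete, the sketch below illustrates, at a purely conceptual level, how a variational detail extractor for a reference subject and a layout-box embedder could feed a cross-attention block that conditions a diffusion denoiser. This is a minimal, hypothetical illustration under assumed design choices; the module names, dimensions, and wiring here are not taken from the paper and do not represent VariGen's actual architecture.

```python
# Hypothetical sketch (not the authors' code): a variational feature
# extractor for a reference subject plus a simple layout-box embedder,
# both used as conditioning for a cross-attention denoiser block.
import torch
import torch.nn as nn


class VariationalDetailExtractor(nn.Module):
    """Encodes a reference image into a latent detail code via the
    reparameterization trick (mu + sigma * eps)."""

    def __init__(self, in_channels: int = 3, latent_dim: int = 256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)

    def forward(self, ref_image: torch.Tensor) -> torch.Tensor:
        h = self.backbone(ref_image)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        eps = torch.randn_like(mu)
        return mu + torch.exp(0.5 * logvar) * eps  # sampled detail code


class LayoutEmbedder(nn.Module):
    """Maps per-subject bounding boxes (x1, y1, x2, y2 in [0, 1]) to tokens."""

    def __init__(self, latent_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4, latent_dim), nn.SiLU(),
                                 nn.Linear(latent_dim, latent_dim))

    def forward(self, boxes: torch.Tensor) -> torch.Tensor:
        return self.mlp(boxes)  # (batch, num_subjects, latent_dim)


class ConditionedDenoiserBlock(nn.Module):
    """One cross-attention block: noisy image tokens attend to the
    concatenated detail code and layout tokens."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, image_tokens, detail_code, layout_tokens):
        cond = torch.cat([detail_code.unsqueeze(1), layout_tokens], dim=1)
        attended, _ = self.attn(image_tokens, cond, cond)
        return self.norm(image_tokens + attended)


if __name__ == "__main__":
    extractor = VariationalDetailExtractor()
    layout = LayoutEmbedder()
    block = ConditionedDenoiserBlock()

    ref = torch.randn(2, 3, 128, 128)    # reference subject images
    boxes = torch.rand(2, 1, 4)          # one layout box per image
    tokens = torch.randn(2, 64, 256)     # noisy image tokens from a U-Net/DiT

    out = block(tokens, extractor(ref), layout(boxes))
    print(out.shape)  # torch.Size([2, 64, 256])
```

In this kind of design, the sampled detail code carries subject appearance while the layout tokens carry spatial placement, so the denoiser can be steered on both axes at once; how VariGen actually couples the two is specified in the paper itself.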
External IDs: dblp:conf/ecai/CaiH0KC25