Keywords: Identity preservation, Prompt-based image relighting, Identity loss, Diffusion models
TL;DR: A new method for identity-preserving prompt-based image relighting that automatically generates paired images in an unsupervised manner and trains only a LoRA adapter with dedicated losses to achieve fine-grained identity preservation.
Abstract: Diffusion-based methods are widely used for image-to-image translation tasks such as object addition/removal, colorization, and prompt-based editing. In personalized editing applications, accurately preserving a person's identity is critical for maintaining subject-specific attributes. Existing methods either use adapter networks, which struggle to retain the subject's facial details, structure, and pose, or rely on full fine-tuning of large foundation models, which is computationally expensive and requires large, high-quality annotated datasets. To overcome these limitations, we propose a novel unsupervised dataset preparation pipeline that enables scalable dataset generation, together with a novel identity-preserving loss function that ensures fine-grained identity preservation in the generated images. Despite using a significantly lighter foundation model and fine-tuning only a fraction of its weights, our method achieves performance comparable to state-of-the-art methods. Furthermore, it generalizes robustly to out-of-training prompts and extends to multi-person images despite being trained only on single-person images.
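The abstract does not specify the form of the identity-preserving loss or the fine-tuning setup, but a minimal sketch of the two ingredients it names might look as follows. It assumes a frozen face-recognition encoder (e.g., an ArcFace-style model) that maps images to identity embeddings, uses a pixel-space MSE as a stand-in for the actual diffusion training objective, and assumes the common convention of tagging LoRA weights with a "lora" substring in parameter names; `lambda_id` is a hypothetical weighting hyperparameter, not a value from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def identity_preserving_loss(pred_image: torch.Tensor,
                             target_image: torch.Tensor,
                             face_encoder: nn.Module,
                             lambda_id: float = 0.5) -> torch.Tensor:
    """Reconstruction term plus a face-embedding identity term.

    `face_encoder` is assumed to be a frozen face-recognition network
    mapping images to identity embeddings; its exact architecture is
    not specified in the paper's abstract.
    """
    # Standard reconstruction loss (a stand-in for the diffusion objective).
    recon = F.mse_loss(pred_image, target_image)

    # Identity term: cosine distance between identity embeddings of the
    # generated image and the reference image.
    with torch.no_grad():
        ref_emb = face_encoder(target_image)  # reference identity, no gradients
    gen_emb = face_encoder(pred_image)        # gradients flow back to the generator
    id_loss = 1.0 - F.cosine_similarity(gen_emb, ref_emb, dim=-1).mean()

    return recon + lambda_id * id_loss


def lora_parameters(model: nn.Module):
    """Select only LoRA adapter weights for optimization, keeping the
    base diffusion model frozen ("lora" naming is an assumed convention)."""
    return [p for n, p in model.named_parameters() if "lora" in n]
```

In a training loop, one would step an optimizer built only over `lora_parameters(model)`, so that the lightweight adapter absorbs the relighting edit while the identity term anchors the subject's appearance to the reference image.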
Primary Area: generative models
Submission Number: 17089