StableID: Multimodal learning for stable identity in personalized Text-to-Face generation

Published: 01 Jan 2025, Last Modified: 22 Jul 2025, Venue: Pattern Recognit. Lett. 2025, License: CC BY-SA 4.0
Abstract: Highlights
• We propose a multimodal, identity-guided diffusion model named StableID.
• We propose a residual cross-attention-based loss to learn pixel-level face details.
• A portrait dataset with richer prompts is built to better capture face details.
• Experiments show that StableID outperforms state-of-the-art methods in identity preservation.