StableID: Multimodal learning for stable identity in personalized Text-to-Face generation

Xueping Wang, Yixuan Gao, Yanan Liu, Feihu Yan, Guangzhe Zhao

Published: 2025, Last Modified: 22 Jul 2025, Pattern Recognit. Lett. 2025, CC BY-SA 4.0

Abstract: Highlights
• We propose an identity-guided diffusion model with multimodal guidance, named StableID.
• We propose a residual cross-attention-based loss to learn pixel-level face details.
• We build a portrait dataset with richer prompts to better capture face details.
• Experiments show that StableID outperforms state-of-the-art methods in identity preservation.