A Dual-Branch Disentanglement Diffusion for ID-Attribute Conditional Face Generation

19 Sept 2025 (modified: 18 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Face Generation, Face Identity, Facial Attributes, Diffusion Models
Abstract: Face identity customization, i.e., face generation with a specified identity, has received increasing attention owing to its extensive applications in personalized content creation. Although existing methods achieve high identity consistency with reference faces, they still struggle to precisely manipulate fine-grained facial attributes. We attribute this issue to the inherent entanglement of identity and attribute information, as well as the lack of attribute-specific supervision. To address this issue, we propose AttPortrait, a high-quality identity-attribute conditional face generation framework. Building on a foundational face diffusion model, we introduce an extra disentanglement branch alongside the conventional denoising branch during the training stage. This extra branch employs explicit attribute supervision to encourage the model to capture attribute information from the text prompts, effectively disentangling identity and attributes and achieving precise attribute manipulation with high identity consistency. Comprehensive experiments demonstrate that our method achieves at least a 34% improvement in attribute accuracy, attains identity similarity close to that of state-of-the-art methods, and maintains comparable FID scores on both real and synthetic datasets.
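To make the training setup described in the abstract concrete, the snippet below is a minimal, hypothetical sketch of combining a conventional denoising objective with an auxiliary attribute-supervision loss from a second branch. All names (ToyDualBranchLoss, attr_head), dimensions, and loss weights are illustrative assumptions; the submission does not specify its actual architecture or losses here.

```python
# Hypothetical sketch: denoising loss + explicit attribute supervision.
# Not the paper's implementation; module names and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyDualBranchLoss(nn.Module):
    """Combines a standard denoising loss with an auxiliary attribute loss."""

    def __init__(self, feat_dim=256, num_attr_classes=10, attr_weight=0.5):
        super().__init__()
        # Assumed disentanglement branch: predicts attribute labels
        # from pooled intermediate features of the denoising network.
        self.attr_head = nn.Linear(feat_dim, num_attr_classes)
        self.attr_weight = attr_weight

    def forward(self, noise_pred, noise_target, mid_feat, attr_labels):
        # Conventional denoising branch objective (noise prediction).
        denoise_loss = F.mse_loss(noise_pred, noise_target)
        # Explicit attribute supervision on the extra branch.
        attr_logits = self.attr_head(mid_feat.mean(dim=1))
        attr_loss = F.cross_entropy(attr_logits, attr_labels)
        return denoise_loss + self.attr_weight * attr_loss


# Usage with random tensors standing in for model outputs.
loss_fn = ToyDualBranchLoss()
noise_pred = torch.randn(4, 4, 64, 64)
noise_target = torch.randn(4, 4, 64, 64)
mid_feat = torch.randn(4, 77, 256)        # e.g. token-level features
attr_labels = torch.randint(0, 10, (4,))  # ground-truth attribute classes
loss = loss_fn(noise_pred, noise_target, mid_feat, attr_labels)
```

The design choice illustrated is that the disentanglement branch only adds a supervised attribute term during training, so the denoising branch alone can be used at inference time.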
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 15474