Rep2Face: Synthetic Face Generation with Identity Representation Sampling

Published: 10 Nov 2025, Last Modified: 13 Nov 2025. OpenReview Archive Direct Upload. CC BY 4.0.
Abstract: In light of escalating legal, privacy, and ethical concerns surrounding real-world face datasets, synthesizing face datasets has emerged as a crucial alternative for training face recognition models. However, generating synthetic datasets presents significant challenges: they must capture the essential characteristics of real data while mitigating its inherent issues. Existing methods predominantly focus on identity-driven image generation while overlooking the critical role of the synthetic identities themselves. As anchors, their quality directly determines the upper bound of the synthetic dataset. Motivated by this, we propose a novel approach for synthetic face generation with identity representation sampling (Rep2Face). We train an Identity Representation Diffusion Model (IRDM) to model the distribution of real identity representations and design an identity sampling strategy that ensures intra-class variation, inter-class separability, and racial balance in the synthetic identity representations. Starting from these synthetic identities, Rep2Face generates face datasets that not only approximate real-world face distributions but also preserve the desired properties of the synthetic identities. Experimental results demonstrate that Rep2Face achieves promising performance on benchmark datasets and race-balanced datasets, and even surpasses the accuracy of real datasets of the same scale on extreme-condition datasets. The code is available at https://github.com/jinchao2001/Rep2Face.
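To make the pipeline concrete, here is a minimal, hypothetical sketch of the two ideas the abstract names: sampling identity representations from a (here, mocked) diffusion model, and enforcing inter-class separability on the sampled set. Everything below — the placeholder denoiser, the embedding dimension `DIM`, the cosine threshold `max_cos`, and the rejection-sampling loop — is an illustrative assumption, not the authors' actual IRDM implementation (see the linked repository for that).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 512  # assumed identity-embedding dimensionality

def denoise_step(x, t):
    """Placeholder for the trained IRDM denoiser. In the real method this
    would be a learned network; here we just shrink toward the origin."""
    return x * 0.98

def sample_identity(steps=50):
    """Toy ancestral sampling: start from Gaussian noise, iteratively denoise,
    then project onto the unit hypersphere where face embeddings usually live."""
    x = rng.standard_normal(DIM)
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)
    return x / np.linalg.norm(x)

def sample_separable_identities(n, max_cos=0.3, max_tries=1000):
    """Rejection sampling for inter-class separability: accept a candidate
    identity only if its cosine similarity to every accepted identity
    stays below max_cos."""
    accepted, tries = [], 0
    while len(accepted) < n and tries < max_tries:
        tries += 1
        z = sample_identity()
        if all(float(z @ a) < max_cos for a in accepted):
            accepted.append(z)
    return np.stack(accepted)

ids = sample_separable_identities(8)
print(ids.shape)
```

In this sketch separability is imposed post hoc by rejection; the paper's sampling strategy additionally targets intra-class variation and racial balance, which would require conditioning or per-identity perturbations beyond this toy loop.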