Transforming the Latent Space of StyleGAN for Real Face Editing

Published: 31 May 2024 · Last Modified: 02 Oct 2024 · OpenReview Archive Direct Upload · CC BY 4.0
Abstract: Despite recent advances in semantic manipulation using StyleGAN, semantic editing of real faces remains challenging. The gap between the W space and the W+ space demands an undesirable trade-off between reconstruction quality and editing quality. To solve this problem, we propose to expand the latent space by replacing the fully connected layers in StyleGAN's mapping network with attention-based transformers. This simple yet effective technique integrates the two aforementioned spaces into a single new latent space, called W++. Our modified StyleGAN maintains the state-of-the-art generation quality of the original StyleGAN with moderately better diversity. More importantly, the proposed W++ space achieves superior performance in both reconstruction quality and editing quality. In addition to these advantages, the W++ space supports existing inversion algorithms and editing methods with only negligible modifications, thanks to its structural similarity with the W/W+ space. Extensive experiments on the FFHQ dataset show that our proposed W++ space is clearly preferable to the previous W/W+ space for real face editing. The code is publicly available for research purposes at https://github.com/AnonSubm2021/TransStyleGAN
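
Below is a minimal sketch of the core architectural change the abstract describes, assuming PyTorch and a StyleGAN2-like configuration: the fully connected mapping network is replaced with transformer encoder blocks that emit one style token per synthesis layer, so the layer-wise codes of the W++ space are produced jointly under self-attention. All class and parameter names here (TransformerMappingNetwork, num_ws, depth, and so on) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class TransformerMappingNetwork(nn.Module):
    """Hypothetical sketch: an attention-based mapping network that maps a
    latent z to one style code per synthesis layer (the W++ idea), instead
    of StyleGAN's shared-MLP mapping to a single w vector."""

    def __init__(self, z_dim=512, w_dim=512, num_ws=18, depth=4, num_heads=8):
        super().__init__()
        self.num_ws = num_ws
        # One learned query token per synthesis layer (18 layers for a
        # 1024x1024 StyleGAN2 generator).
        self.layer_tokens = nn.Parameter(torch.randn(num_ws, w_dim) * 0.02)
        self.z_proj = nn.Linear(z_dim, w_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=w_dim, nhead=num_heads, dim_feedforward=4 * w_dim,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.out = nn.Linear(w_dim, w_dim)

    def forward(self, z):
        # Broadcast the projected latent onto every per-layer token:
        # (batch, z_dim) -> (batch, num_ws, w_dim).
        tokens = self.layer_tokens.unsqueeze(0) + self.z_proj(z).unsqueeze(1)
        # Self-attention lets the layer-wise codes interact, unlike the
        # independent copies of w used by the original W/W+ spaces.
        return self.out(self.blocks(tokens))  # (batch, num_ws, w_dim)


if __name__ == "__main__":
    mapping = TransformerMappingNetwork()
    z = torch.randn(2, 512)
    print(mapping(z).shape)  # torch.Size([2, 18, 512])
```

Because every per-layer code attends to all the others, a single network can cover both the shared-code behavior of W and the layer-wise flexibility of W+, which is exactly the reconstruction-versus-editing trade-off the abstract targets.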