Contrastive Learning on Synthetic Videos for GAN Latent Disentangling

Kevin Duarte; Wei-An Lin; Ratheesh Kalarot; Jingwan Lu; Eli Shechtman; Shabnam Ghadar; Mubarak Shah

Contrastive Learning on Synthetic Videos for GAN Latent Disentangling

Kevin Duarte, Wei-An Lin, Ratheesh Kalarot, Jingwan Lu, Eli Shechtman, Shabnam Ghadar, Mubarak Shah

03 Oct 2022 (modified: 05 May 2023)Neurips 2022 SyntheticData4MLReaders: Everyone

Keywords: generative models, contrastive learning, synthetic data, latent disentangling

TL;DR: We learn to disentangle GAN latent representations by using a contrastive objective on frames of synthetically generated videos

Abstract: In this paper, we present a method to disentangle appearance and structural information in the latent space of StyleGAN. We train an autoencoder whose encoder extracts appearance and structural features from an input latent code and then reconstructs the original input using the decoder. To train this network, We propose a video-based latent contrastive learning framework. With the observation that the appearance of a face does not change within a short video, the encoder learns to pull appearance representations of various video frames together while pushing appearance representations of different faces apart. Similarly, the structural representations of augmented versions of the same frame are pulled together, while the representation across different frames are pushed apart. As face video datasets lack sufficient number of unique identities, we propose a method to synthetically generate videos. This allows our disentangling network to observe a larger variation of appearances, expressions, and poses during training. We evaluate our approach on the tasks of expression transfer in images and motion transfer in videos.

Supplementary Material: zip

4 Replies

Loading