Keywords: Generative Adversarial Networks, Wasserstein, Generalization, PCA
Abstract: Generative Adversarial Networks (GANs) have become a popular method for learning a probability model from data. Many GAN architectures with different optimization metrics have been introduced recently. Instead of proposing yet another architecture, this paper aims to provide an understanding of some of the basic issues surrounding GANs. First, we propose a natural way of specifying the loss function for GANs by drawing a connection with supervised learning. Second, we shed light on the statistical performance of GANs through the analysis of a simple linear-quadratic-Gaussian (LQG) setting: the generator is linear, the loss function is quadratic, and the data are drawn from a Gaussian distribution. We show that in this setting: 1) the optimal GAN solution converges to population Principal Component Analysis (PCA) as the number of training samples increases; 2) the number of samples required scales exponentially with the dimension of the data; 3) the number of samples required scales almost linearly with the dimension if the discriminator is constrained to be quadratic. Moreover, under this quadratic constraint on the discriminator, the optimal finite-sample GAN simply performs empirical PCA.
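To make the empirical-PCA statement concrete, the following is a minimal NumPy sketch of a linear generator built from the top principal components of the sample covariance. It is an illustration under stated assumptions, not the paper's training procedure: the data are assumed zero-mean, the names `empirical_pca_generator` and `covariance_gap` are hypothetical, and the diagonal covariance and sample sizes are arbitrary choices made for the example.

```python
import numpy as np


def empirical_pca_generator(samples, k):
    """Return a d x k matrix A so that A @ z, with z ~ N(0, I_k), has the
    rank-k PCA truncation of the sample covariance as its covariance.

    Assumption: the samples are zero-mean, so no centering is applied.
    """
    n = samples.shape[0]
    sample_cov = samples.T @ samples / n
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(sample_cov)
    top = np.argsort(eigvals)[::-1][:k]
    # Scale each leading eigenvector by the square root of its eigenvalue.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


# Illustrative setup (assumed values): 5-dimensional Gaussian data,
# 2-dimensional latent code for the linear generator.
d, k = 5, 2
true_cov = np.diag([5.0, 3.0, 1.0, 0.5, 0.1])
# Population PCA keeps only the k largest eigen-directions of true_cov.
population_pca_cov = np.diag([5.0, 3.0, 0.0, 0.0, 0.0])

rng = np.random.default_rng(0)


def covariance_gap(n):
    """Frobenius distance between the generator covariance fitted on n
    samples and the population rank-k PCA covariance."""
    samples = rng.multivariate_normal(np.zeros(d), true_cov, size=n)
    A = empirical_pca_generator(samples, k)
    return np.linalg.norm(A @ A.T - population_pca_cov, ord="fro")


gap_small = covariance_gap(50)
gap_large = covariance_gap(50_000)
print(f"gap with n=50:     {gap_small:.4f}")
print(f"gap with n=50000:  {gap_large:.4f}")
```

With a large sample, the generator covariance `A @ A.T` should lie close to the population rank-k PCA covariance, which mirrors the convergence described in the abstract. The sketch does not model the discriminator or the sample-complexity rates.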