- Abstract: Despite the recent success of Lipschitz regularization in stabilizing GAN training, the exact reason for its effectiveness remains poorly understood. It is commonly believed that the main function of K-Lipschitz regularization is to keep the L2-norm of the neural network gradient no larger than a threshold K (e.g. K=1), such that || grad f || <= K. In this work, however, we uncover the counter-intuitive fact that under typical GAN setups, the choice of K does not matter. This finding suggests that, rather than keeping the neural network gradients small, an even more important function of Lipschitz regularization is to restrict the domain and interval of attainable gradient values of the loss function. This restriction avoids bias of the loss function over input samples. Empirically, we verify our proposition on the MNIST, CIFAR10 and CelebA datasets.