Magnifying the Three Phases of GAN Training — Fitting, Refining and Collapsing

TMLR Paper2643 Authors

08 May 2024 (modified: 27 Jun 2024) | Under review for TMLR | Readers: Everyone | License: CC BY-SA 4.0
Abstract: Generative Adversarial Networks (GANs) are efficient generative models but may suffer from mode mixture and mode collapse. We present an original global characterization of GAN training by dividing it into three successive phases: fitting, refining, and collapsing. This characterization reveals a strong correlation between mode mixture and the refining phase, and between mode collapse and the collapsing phase. To analyze the causes and features of each phase, we propose a novel theoretical framework that integrates both continuous and discrete aspects of GANs, addressing a gap in the existing literature, which predominantly focuses on only one aspect. We develop a specialized metric to detect the phase transition from refining to collapsing and integrate it into an "early stopping" algorithm to optimize GAN training. Experiments on synthetic datasets and on real-world datasets, including MNIST, Fashion MNIST, and CIFAR-10, substantiate our theoretical insights and highlight the efficacy of our algorithm.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
- Added UMAP plots of MNIST to Fig. 1, illustrating how image distributions are *analogous* to Gaussian mixtures, with details such as the effects of different initialization methods provided in the appendices.
- Added analyses on how a class of *suboptimal* discriminators affects the vector field that updates particles (complementing Section 3) and the evolution of steepness (complementing Section 4).
- Added *data-dependent theoretical results* to Sections 3.1 and 3.2 (previously Sections 3.2 and 3.3). Moved the original Theorem 3.1 and its implications to the appendices, leaving only the conclusions in the main text for brevity.
- Modified Alg. 1 and Thm. 2.1 to emphasize that the *stop gradient operator* is applied to the $\hat{Z}_i$'s.
- Added Table 2 to summarize the differences in Fig. 3.
- Added several references.
- Fixed some typos.
Assigned Action Editor: ~Michael_U._Gutmann1
Submission Number: 2643