Abstract: The conditional Generative Adversarial Network (cGAN) is an important type of GAN that is often equipped with an auxiliary classifier. However, existing cGANs usually suffer from mode collapse, which can incur unstable performance in practice. In this paper, we propose a novel stable training method for cGANs that preserves generation fidelity and diversity well. Our key ideas are to design efficient adversarial training strategies for the auxiliary classifier and to mitigate the overconfidence issue caused by the cross-entropy loss. We propose a classifier-based cGAN called Confidence Guided Generative Adversarial Networks (CG-GAN) by introducing adversarial training into a $K$-way classifier. In particular, we show in theory that the obtained $K$-way classifier can encourage the generator to learn the real joint distribution. To further enhance performance and stability, we propose to establish a high-entropy prior label distribution for the generated data and incorporate a reverse KL divergence term into the minimax loss of CG-GAN. Through a comprehensive set of experiments on popular benchmark datasets, including the large-scale dataset ImageNet, we demonstrate the advantages of our proposed method over several state-of-the-art cGANs.
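As a purely illustrative sketch (not the paper's implementation), a reverse KL divergence term that pulls the classifier's predictions on generated data toward a high-entropy uniform prior over the $K$ classes could be written as follows; the divergence direction, the choice of a uniform prior, and all function names here are assumptions for illustration:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reverse_kl_to_uniform(logits):
    """Hypothetical regularizer: KL(p || u), where p = softmax(logits)
    is the classifier's predicted distribution on a generated sample
    and u is the uniform (high-entropy) prior over K classes.

    KL(p || u) = sum_k p_k * log(p_k / (1/K))
               = sum_k p_k * log p_k + log K
    It is 0 when p is uniform and grows as p becomes more confident,
    so minimizing it discourages overconfident predictions.
    """
    p = softmax(logits)
    k = len(p)
    return sum(pi * math.log(pi) for pi in p if pi > 0) + math.log(k)
```

Minimizing such a term on generated samples would counteract the overconfidence induced by the cross-entropy loss; the exact weighting and where the term enters the minimax objective are specified in the paper itself.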
Primary Subject Area: [Generation] Generative Multimedia
Relevance To Conference: In this paper, we propose a novel stable training method for conditional Generative Adversarial Networks (cGANs) that aims to address prevalent issues and improve practical performance. The Generative Adversarial Network (GAN) is a popular generative model for high-fidelity image generation, and the conditional GAN (cGAN) is an important type of GAN used for conditional image generation in multimedia systems. However, existing cGANs commonly struggle with training collapse, resulting in unstable generation quality, limited diversity, and sub-par performance in applied settings. Our proposed approach seeks to enhance the stability of cGANs while preserving generation quality and diversity. We believe this research has the potential to strengthen the capabilities of cGANs for practical multimedia applications.
Supplementary Material: zip
Submission Number: 1834