Keywords: Generative Adversarial Network, Departure From Normality, Schur Decomposition, Spectrogram Synthesis
Abstract: In this paper we propose a conditioning trick, called difference departure from normality, applied to the generator network in response to instability issues during GAN training. We force the generator to approach the departure from normality of real samples, computed in the spectral domain via Schur decomposition. This constraint makes the generator amenable to truncation without restricting it from exploring all possible modes. We slightly modify the BigGAN architecture, incorporating residual networks, to synthesize 2D representations of audio signals, which enables reconstructing high-quality sounds with some phase information preserved. Additionally, the proposed conditional training scenario strikes a trade-off between fidelity and variety in the generated spectrograms. Experimental results on the UrbanSound8k and ESC-50 environmental sound datasets and the Mozilla Common Voice dataset show that the proposed GAN configuration with the conditioning trick remarkably outperforms baseline architectures according to three objective metrics: inception score, Fréchet inception distance, and signal-to-noise ratio.
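For reference, below is a minimal sketch of the departure-from-normality quantity the abstract describes, using Henrici's Frobenius-norm definition computed from the complex Schur form. It assumes square spectrogram matrices, and the penalty form (absolute difference of batch means, combined with the adversarial loss via a hypothetical weight `lam`) is an assumption, since the abstract does not specify it; it is an illustrative NumPy sketch, not the paper's differentiable training loss.

```python
import numpy as np
from scipy.linalg import schur

def departure_from_normality(a: np.ndarray) -> float:
    """Henrici's departure from normality: Schur-decompose a = Q T Q^H and
    return the Frobenius norm of the strictly upper-triangular part of T
    (zero if and only if `a` is a normal matrix)."""
    t, _ = schur(a, output="complex")    # complex Schur form: T is upper triangular
    off_diag = t - np.diag(np.diag(t))   # strip the eigenvalues off the diagonal
    return float(np.linalg.norm(off_diag, "fro"))

def normality_penalty(fake_specs, real_specs) -> float:
    """Hypothetical conditioning term: absolute difference between the mean
    departure from normality of generated and real spectrogram batches."""
    dep_fake = np.mean([departure_from_normality(s) for s in fake_specs])
    dep_real = np.mean([departure_from_normality(s) for s in real_specs])
    return abs(dep_fake - dep_real)

# Usage sketch: add the penalty to the generator objective,
# e.g. loss_G = loss_adversarial + lam * normality_penalty(fake, real).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake = [rng.standard_normal((128, 128)) for _ in range(4)]  # square spectrograms assumed
    real = [rng.standard_normal((128, 128)) for _ in range(4)]
    print(normality_penalty(fake, real))
```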
One-sentence Summary: This paper improves the training stability of GANs by constraining the generator with a departure-from-normality metric, applied to audio representation synthesis.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=5c3_wq3sPp