HyperCGAN: Text-to-Image Synthesis with HyperNet-Modulated Conditional Generative Adversarial Networks

Published: 28 Jan 2022, Last Modified: 13 Feb 2023. ICLR 2022 Submission. Readers: Everyone
Keywords: gan, generative modelling, text-to-image, text2image, hypernetworks
Abstract: We present HyperCGAN: a conceptually simple and general approach to text-to-image synthesis that uses hypernetworks to condition a GAN on text. In our setting, the generator and discriminator weights are controlled by their corresponding hypernetworks, which modulate the weight parameters based on the provided text query. We explore different modulation mechanisms depending on the underlying architecture of the target network and the structure of the conditioning variable. Our method is highly flexible, and we test it in two scenarios: traditional image generation (on top of StyleGAN2) and continuous image generation (on top of INR-GAN). To the best of our knowledge, ours is the first work to explore text-controllable continuous image generation. In both cases, hypernetwork-based conditioning achieves state-of-the-art performance in terms of modern text-to-image evaluation measures and human studies on the CUB $256^2$, COCO $256^2$, and ArtEmis $256^2$ datasets.
One-sentence Summary: A conceptually simple and general approach for text-to-image synthesis that uses hypernetworks
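To make the core idea concrete, the sketch below shows one plausible way a hypernetwork could modulate a convolutional layer's weights from a text embedding, combining per-sample weight modulation with StyleGAN2-style demodulation. This is a minimal, illustrative PyTorch sketch under assumed details, not the authors' implementation: the class name `HyperModulatedConv2d`, the small MLP hypernetwork, and all dimensions are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperModulatedConv2d(nn.Module):
    """Hypothetical sketch: a small MLP (the hypernetwork) maps a text
    embedding to per-input-channel scales that modulate the base conv
    weights, in the spirit of StyleGAN2 weight modulation but driven
    by the text condition instead of a latent style code."""

    def __init__(self, in_ch, out_ch, k, text_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k))
        # Hypernetwork: text embedding -> per-input-channel scale
        self.hyper = nn.Sequential(
            nn.Linear(text_dim, 128), nn.ReLU(),
            nn.Linear(128, in_ch),
        )

    def forward(self, x, text_emb):
        b = x.shape[0]
        s = self.hyper(text_emb)                            # (B, in_ch)
        w = self.weight[None] * s[:, None, :, None, None]   # modulate
        # Demodulate (as in StyleGAN2) for training stability
        d = torch.rsqrt((w ** 2).sum(dim=[2, 3, 4]) + 1e-8)
        w = w * d[..., None, None, None]
        # Grouped conv applies a distinct text-conditioned weight
        # tensor to each sample in the batch
        x = x.reshape(1, -1, *x.shape[2:])
        w = w.reshape(-1, *w.shape[2:])
        out = F.conv2d(x, w, padding=self.weight.shape[-1] // 2, groups=b)
        return out.reshape(b, -1, *out.shape[2:])

if __name__ == "__main__":
    layer = HyperModulatedConv2d(in_ch=64, out_ch=64, k=3, text_dim=512)
    x = torch.randn(4, 64, 32, 32)
    t = torch.randn(4, 512)   # e.g. a sentence embedding from a text encoder
    y = layer(x, t)           # -> (4, 64, 32, 32)
```

The grouped convolution is the key trick here: it lets every sample in the batch be convolved with its own text-dependent weights in a single call, which is what makes per-sample weight modulation practical; the demodulation step normalizes the modulated weights so that conditioning does not destabilize activation magnitudes.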