PhraseGAN: Phrase-Boost Generative Adversarial Network for Text-to-Image Generation

Published: 01 Jan 2022, Last Modified: 15 May 2023, ICME 2022
Abstract: A phrase contains an object-denoting noun and words describing its attributes. Focusing on phrases therefore helps generate images that contain the objects together with their closely related characteristics. We propose a Phrase-Boost Generative Adversarial Network (PhraseGAN) with threefold improvements for scene-level text-to-image generation. First, we propose a Transformer-based encoder that encodes the input words and sentences and groups related words with their target nouns into phrases via text correlation analysis. Second, we use Graph Convolutional Networks to measure fine-grained text-image similarity, which imposes constraints on the relative positions of different objects. Finally, we design a phrase-region discriminator that assesses both the quality of the generated objects and the consistency between phrases and their corresponding objects. Experimental results on the Microsoft COCO dataset demonstrate that PhraseGAN generates better images from text than state-of-the-art methods.
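
The abstract does not include code, but the phrase-region discriminator idea can be illustrated with a small sketch. The PyTorch module below is a hypothetical, simplified version: it scores each (phrase embedding, image region feature) pair both for realism (an unconditional branch) and for phrase-region consistency (a conditional branch). All names, dimensions, and layer choices are assumptions made for illustration and are not taken from the paper.

# Hypothetical sketch of a phrase-region discriminator (not the authors' code).
# It jointly scores (a) how realistic each phrase-aligned image region looks and
# (b) how well that region matches its associated phrase embedding.
import torch
import torch.nn as nn


class PhraseRegionDiscriminator(nn.Module):
    def __init__(self, region_dim=256, phrase_dim=256, hidden_dim=256):
        super().__init__()
        # Unconditional branch: real/fake score from region features alone.
        self.uncond_head = nn.Sequential(
            nn.Linear(region_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),
        )
        # Conditional branch: score from the concatenated region + phrase pair,
        # which penalizes regions that do not match their phrase.
        self.cond_head = nn.Sequential(
            nn.Linear(region_dim + phrase_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, region_feats, phrase_embs):
        # region_feats: (batch, num_phrases, region_dim) pooled features of the
        #               image regions associated with each phrase
        # phrase_embs:  (batch, num_phrases, phrase_dim) phrase embeddings
        uncond_score = self.uncond_head(region_feats).squeeze(-1)
        pair = torch.cat([region_feats, phrase_embs], dim=-1)
        cond_score = self.cond_head(pair).squeeze(-1)
        return uncond_score, cond_score


if __name__ == "__main__":
    disc = PhraseRegionDiscriminator()
    regions = torch.randn(2, 4, 256)   # 2 images, 4 phrase-aligned regions each
    phrases = torch.randn(2, 4, 256)   # matching phrase embeddings
    u, c = disc(regions, phrases)
    print(u.shape, c.shape)            # torch.Size([2, 4]) torch.Size([2, 4])

In a GAN training loop, the two scores would typically enter separate adversarial losses, so that a generated region is penalized both for looking unrealistic and for mismatching its phrase.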