Highlights

• We employ an efficient single-stage GAN structure with fewer parameters and faster inference speed.
• A novel Context-Aware Text-Image Block improves vision-language semantic consistency for text-to-image synthesis.
• An innovative Attention Convolution Module enriches the diversity and quality of synthesized images.
• Mixing self-attention with convolution facilitates the understanding of complex images, improving language-vision matching.