Realistic image generation using adversarial generative networks combined with depth information

Published: 01 Jan 2023, Last Modified: 11 Apr 2025. Digit. Signal Process. 2023. License: CC BY-SA 4.0
Abstract: Existing image generation methods often produce blurry, unrealistic results that lack layering and structure. Depth information can accurately control the relative positions and hierarchy of different objects in an image. Our goal is to enhance the realism, hierarchy, and quality of generated images by incorporating depth information into image-to-image tasks. To address these issues, we propose a multi-conditional semantic image generation method that fuses depth information. Built on a Generative Adversarial Network architecture, the method takes paired semantic labels and depth maps as input and fuses these multi-conditional inputs through our proposed Multi-scale Feature Extraction and Information Fusion Module. In addition, we add a channel-attention mechanism to the generator to strengthen inter-channel connectivity and suppress confusion between different semantic features. With only a modest increase in training cost, the proposed module generates realistic images that match the input semantic layout. Extensive experiments on three challenging datasets show that our model produces superior visual quality and quantitative metrics, demonstrating the effectiveness of the proposed method.
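The two ingredients the abstract names — fusing a semantic label map with a depth map as joint conditioning, and a channel-attention gate inside the generator — can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' implementation: the function names (`fuse_conditions`, `channel_attention`) are hypothetical, the weights are random stand-ins for learned parameters, and the attention follows a standard squeeze-and-excitation pattern assumed to approximate the paper's mechanism.

```python
import numpy as np

def fuse_conditions(semantic_map, depth_map):
    """Stack one-hot semantic channels with the depth map as an extra channel.

    semantic_map: (C, H, W) one-hot label planes; depth_map: (H, W).
    Channel concatenation is one simple way to form the multi-conditional
    input; the paper's fusion module is more elaborate (multi-scale).
    """
    return np.concatenate([semantic_map, depth_map[None]], axis=0)

def channel_attention(x, reduction=4, seed=0):
    """Squeeze-and-excitation style per-channel gating on a (C, H, W) map."""
    c = x.shape[0]
    squeezed = x.mean(axis=(1, 2))                 # global average pool -> (C,)
    rng = np.random.default_rng(seed)              # random weights stand in
    w1 = rng.standard_normal((max(c // reduction, 1), c)) * 0.1
    w2 = rng.standard_normal((c, max(c // reduction, 1))) * 0.1
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid -> (0, 1) weights
    return x * gate[:, None, None]                 # reweight each channel
```

In a real generator these blocks would be learned convolutional layers; the point here is only the data flow: depth enters as an extra conditioning channel, and the attention gate rescales channels so semantically distinct features interfere less.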
