Multi-level Gate Feature Aggregation with Spatially Adaptive Batch-Instance Normalization for Semantic Image Synthesis

Jia Long, Hongtao Lu

Published: 2021, Last Modified: 13 Jun 2025, Venue: MMM (1) 2021, License: CC BY-SA 4.0

Abstract: In this paper, we focus on generating realistic images from an input semantic layout, a task also known as semantic image synthesis. Most previous methods are based on conditional generative adversarial networks built from stacks of convolution, normalization, and non-linearity layers. However, these methods tend to generate blurred regions and distorted structures. They have two limitations: their normalization layers cannot strike a good balance between preserving semantic layout information and accommodating geometric changes, and they cannot effectively aggregate multi-level features. To address these problems, we propose a novel method that incorporates a multi-level gate feature aggregation mechanism (GFA) and spatially adaptive batch-instance normalization (SPAda-BIN) for semantic image synthesis. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches in terms of both visual fidelity and quantitative metrics.
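The abstract names SPAda-BIN but gives no formula for it, so the following is only an illustrative sketch of what a spatially adaptive batch-instance normalization could look like, not the authors' implementation. It assumes a learnable per-channel gate `rho` that blends batch-normalized and instance-normalized features (as in batch-instance normalization), followed by a per-pixel scale and shift derived from the semantic layout (as in spatially adaptive normalization). The function name `spada_bin`, the parameters `w_gamma` and `w_beta`, and the use of a per-class linear map in place of learned convolutions are all assumptions made for illustration.

```python
import numpy as np


def spada_bin(x, seg, rho, w_gamma, w_beta, eps=1e-5):
    """Hypothetical sketch of spatially adaptive batch-instance normalization.

    x:       (N, C, H, W) feature maps
    seg:     (N, K, H, W) one-hot semantic layout with K classes
    rho:     (C,) gate in [0, 1]; 1 means pure batch norm, 0 pure instance norm
    w_gamma: (K, C) per-class scale weights (stand-in for a learned conv)
    w_beta:  (K, C) per-class shift weights (stand-in for a learned conv)
    """
    # Batch statistics: one mean/variance per channel over batch and space.
    mu_b = x.mean(axis=(0, 2, 3), keepdims=True)
    var_b = x.var(axis=(0, 2, 3), keepdims=True)
    x_bn = (x - mu_b) / np.sqrt(var_b + eps)

    # Instance statistics: one mean/variance per sample and channel.
    mu_i = x.mean(axis=(2, 3), keepdims=True)
    var_i = x.var(axis=(2, 3), keepdims=True)
    x_in = (x - mu_i) / np.sqrt(var_i + eps)

    # Gate between the two normalized versions, channel by channel.
    r = np.clip(rho, 0.0, 1.0).reshape(1, -1, 1, 1)
    x_hat = r * x_bn + (1.0 - r) * x_in

    # Spatially varying modulation computed from the layout (assumed form).
    gamma = np.einsum("nkhw,kc->nchw", seg, w_gamma)
    beta = np.einsum("nkhw,kc->nchw", seg, w_beta)
    return (1.0 + gamma) * x_hat + beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, c, k, h, w = 2, 3, 5, 4, 4
    x = rng.normal(size=(n, c, h, w))
    labels = rng.integers(0, k, size=(n, h, w))
    seg = np.eye(k)[labels].transpose(0, 3, 1, 2)  # one-hot, (N, K, H, W)
    out = spada_bin(x, seg, np.full(c, 0.5),
                    rng.normal(size=(k, c)), rng.normal(size=(k, c)))
    print(out.shape)
```

In this sketch the gate lets each channel choose how much batch-level versus instance-level statistics to use, while the layout-dependent scale and shift reintroduce semantic information after normalization. Whether the paper's layer follows this exact form is not stated in the abstract.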