Compositional GAN: Learning Conditional Image Composition

27 Sept 2018 (modified: 22 Oct 2023) | ICLR 2019 Conference Withdrawn Submission
Abstract: Generative Adversarial Networks (GANs) can produce images of surprising complexity and realism, but are generally structured to sample from a single latent source, ignoring the explicit spatial interactions between the multiple entities that may be present in a scene. Capturing such complex interactions between different objects in the world, including their relative scaling, spatial layout, occlusion, and viewpoint transformations, is a challenging problem. In this work, we propose to model object composition in a GAN framework as a self-consistent composition-decomposition network. Our model is conditioned on object images drawn from their marginal distributions and generates a realistic image from their joint distribution. We evaluate our model through qualitative experiments and user evaluations in scenarios where either paired or unpaired examples of the individual object images and the joint scenes are given during training. Our results reveal that the learned model captures potential interactions between the two input object domains and produces plausible new instances of the composed scene at test time.
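A minimal sketch of the composition-decomposition idea described in the abstract, written in PyTorch. The module shapes, network depths, and loss terms here are illustrative assumptions, not the paper's actual architecture: a composition network maps two conditioned object images to a composed scene, a decomposition network maps the scene back to the two objects, and a self-consistency loss ties the cycle together.

```python
# Hedged sketch of a self-consistent composition-decomposition loop.
# All names and shapes are hypothetical stand-ins for the paper's networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_net(in_ch, out_ch):
    # Tiny stand-in generator; the real model would use a deeper
    # image-to-image translation network (e.g. a U-Net).
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, out_ch, 3, padding=1), nn.Tanh(),
    )

class CompositionDecomposition(nn.Module):
    def __init__(self):
        super().__init__()
        self.compose = conv_net(6, 3)    # two RGB objects -> composed RGB scene
        self.decompose = conv_net(3, 6)  # composed scene -> both objects back

    def forward(self, obj_a, obj_b):
        composed = self.compose(torch.cat([obj_a, obj_b], dim=1))
        rec = self.decompose(composed)
        return composed, rec[:, :3], rec[:, 3:]

model = CompositionDecomposition()
a = torch.randn(1, 3, 64, 64)  # e.g. a chair image from its marginal distribution
b = torch.randn(1, 3, 64, 64)  # e.g. a table image from its marginal distribution
composed, rec_a, rec_b = model(a, b)

# Self-consistency: decomposing the composed scene should recover the inputs.
consistency_loss = F.l1_loss(rec_a, a) + F.l1_loss(rec_b, b)
# The full model would combine this with adversarial losses (and, in the
# paired setting, reconstruction against ground-truth composed scenes).
```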
Keywords: Image Composition, GAN, Conditional Image Generation
TL;DR: We develop a novel approach to modeling object compositionality in images within a GAN framework.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:1807.07560/code)