ComGAN: Unsupervised Disentanglement and Segmentation via Image Composition

Published: 31 Oct 2022, 18:00
Last Modified: 11 Jan 2023, 10:38
NeurIPS 2022 Accept
Readers: Everyone
Keywords: Generative Adversarial Networks, Trivial solutions, Image Disentanglement, Unsupervised Segmentation
TL;DR: ComGAN is a flexible unsupervised model that generates realistic images and semantically meaningful masks while effectively avoiding trivial solutions.
Abstract: We propose ComGAN, a simple unsupervised generative model that simultaneously generates realistic images and semantically meaningful masks under an adversarial loss and a binary regularization. We first investigate two kinds of trivial solutions that arise in the compositional generation process and show that they stem from vanishing gradients on the mask. We then avoid these trivial solutions through the design of the network architecture. Furthermore, building on ComGAN, we design two fully unsupervised modules (DS-ComGAN): a disentanglement module that associates the foreground, background, and mask with three independent variables, and a segmentation module that learns object segmentation. Experimental results show that (i) ComGAN's network architecture effectively avoids trivial solutions without any supervised information or regularization; and (ii) DS-ComGAN outperforms existing semi-supervised and weakly supervised methods by a large margin on both the image disentanglement and unsupervised segmentation tasks. This suggests that redesigning ComGAN is a possible direction for future unsupervised work.
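
To make the compositional setup concrete, here is a minimal NumPy sketch of mask-based image composition and of what a trivial mask looks like. It is an illustration under stated assumptions, not the authors' implementation: the blending rule `mask * foreground + (1 - mask) * background` is the usual compositional-generation formulation and is assumed here, the `binary_penalty` function is one hypothetical form of a binary regularizer, and the two trivial solutions are assumed to be the all-foreground and all-background masks. All function names are made up for this example.

```python
import numpy as np

def compose(foreground, background, mask):
    """Blend two layers: mask selects foreground, (1 - mask) selects background."""
    return mask * foreground + (1.0 - mask) * background

def binary_penalty(mask):
    """Hypothetical binarization penalty: zero when every pixel is exactly 0 or 1."""
    return float(np.mean(np.minimum(mask, 1.0 - mask)))

def is_trivial(mask, tol=1e-3):
    """Flag a mask that is (almost) entirely foreground or entirely background."""
    coverage = float(mask.mean())
    return coverage < tol or coverage > 1.0 - tol

rng = np.random.default_rng(0)
fg = rng.random((8, 8, 3))   # stand-in for a generated foreground layer
bg = rng.random((8, 8, 3))   # stand-in for a generated background layer

all_ones = np.ones((8, 8, 1))    # assumed trivial case: foreground paints everything
all_zeros = np.zeros((8, 8, 1))  # assumed trivial case: background paints everything
half = np.zeros((8, 8, 1))
half[:, :4] = 1.0                # non-trivial mask: left half is foreground

for name, m in [("all_ones", all_ones), ("all_zeros", all_zeros), ("half", half)]:
    image = compose(fg, bg, m)
    print(name, image.shape, binary_penalty(m), is_trivial(m))
```

Note that in this sketch the penalty is zero for both constant masks, so a binarization term of this form cannot by itself rule out the trivial cases. That is consistent with the abstract's choice to address them through the architecture.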
Supplementary Material: pdf