Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation

Xin Yuan; Michael Maire

Published: 25 Sept 2024, Last Modified: 06 Nov 2024

Venue: NeurIPS 2024 poster

License: CC BY 4.0

Keywords: diffusion models, unsupervised learning, image segmentation, neural network architecture

Abstract: We develop a neural network architecture which, trained in an unsupervised manner as a denoising diffusion model, simultaneously learns to both generate and segment images. Learning is driven entirely by the denoising diffusion objective, without any annotation or prior knowledge about regions during training. A computational bottleneck, built into the neural architecture, encourages the denoising network to partition an input into regions, denoise them in parallel, and combine the results. Our trained model generates both synthetic images and, by simple examination of its internal predicted partitions, semantic segmentations of those images. Without fine-tuning, we directly apply our unsupervised model to the downstream task of segmenting real images via noising and subsequently denoising them. Experiments demonstrate that our model achieves accurate unsupervised image segmentation and high-quality synthetic image generation across multiple datasets.
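To make the "partition, denoise in parallel, combine" idea concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the partitions are parameterized or merged, so the softmax masks, the mask-weighted sum, and the names `combine_regions`, `mask_logits`, and `region_outputs` are all assumptions made for illustration. The `add_noise` helper uses the standard DDPM forward-noising formula, which may differ in detail from what the paper uses to noise real images before segmenting them.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_regions(mask_logits, region_outputs):
    """Blend K per-region predictions into one prediction per pixel.

    mask_logits[p][k]    : assumed score that pixel p belongs to region k
    region_outputs[k][p] : assumed denoiser output for region k at pixel p
    Returns (combined, labels): the mask-weighted sum per pixel and the
    argmax region index per pixel, read here as a segmentation label.
    """
    combined, labels = [], []
    for p, logits in enumerate(mask_logits):
        weights = softmax(logits)  # soft assignment of pixel p to the K regions
        combined.append(sum(w * region_outputs[k][p] for k, w in enumerate(weights)))
        labels.append(max(range(len(weights)), key=weights.__getitem__))
    return combined, labels

def add_noise(pixels, alpha_bar, rng):
    """Standard DDPM forward step: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in pixels]

# Toy example: 3 pixels, K = 2 regions (all values are made up).
mask_logits = [[10.0, -10.0], [-10.0, 10.0], [0.0, 0.0]]
region_outputs = [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]
combined, labels = combine_regions(mask_logits, region_outputs)
noisy = add_noise([0.5, -0.25, 0.0], alpha_bar=0.9, rng=random.Random(0))
```

In this toy setup the first two pixels are assigned almost entirely to one region each, while the third is split evenly and receives the average of the two region outputs. Taking the argmax of the masks is one plausible reading of the "simple examination of its internal predicted partitions" mentioned in the abstract.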

Primary Area: Diffusion based models

Submission Number: 14105
