Image Modeling with Deep Convolutional Gaussian Mixture Models

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: Gaussian Mixture Model, Deep Learning, Unsupervised Representation Learning, Sampling
Abstract: In this conceptual work, we present DCGMM, a deep hierarchical Gaussian Mixture Model (GMM) that is particularly suited to describing and generating images. Vanilla (i.e., "flat") GMMs require a very large number of components to describe images well, leading to long training times and memory issues. DCGMMs avoid this through a stacked architecture of multiple GMM layers, linked by convolution and pooling operations, which allows them to exploit the compositionality of images much as deep CNNs do. DCGMMs can be trained end-to-end by SGD; this sets them apart from vanilla GMMs, which are trained by EM and require a prior k-means initialization that is infeasible in a layered structure. For generating sharp images with DCGMM, we introduce a new gradient-based technique for sampling through non-invertible operations like convolution and pooling. On the MNIST and FashionMNIST datasets, we validate the DCGMM model by demonstrating its superiority over "flat" GMMs for clustering, sampling and outlier detection. We additionally demonstrate the applicability of DCGMM to variant generation, in-painting and class-conditional sampling.
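The stacked architecture can be pictured concretely. Below is a minimal NumPy sketch of a single convolutional GMM layer: image patches are extracted convolution-style and scored under a diagonal-covariance GMM, producing a per-patch log-likelihood map that a subsequent pooling and GMM layer could consume. All specifics here (4x4 patches, stride 2, diagonal covariances, the helper names `extract_patches` and `gmm_log_likelihood`) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed setup, not the authors' code): one convolutional
# GMM layer scoring image patches under a diagonal-covariance mixture.
import numpy as np

def extract_patches(img, k=4, stride=2):
    """Slide a k x k window over a 2-D image and flatten each patch."""
    H, W = img.shape
    patches = [
        img[i:i + k, j:j + k].ravel()
        for i in range(0, H - k + 1, stride)
        for j in range(0, W - k + 1, stride)
    ]
    return np.stack(patches)  # shape: (n_patches, k*k)

def gmm_log_likelihood(patches, means, log_vars, log_weights):
    """Per-patch log-likelihood under a diagonal-covariance GMM.

    patches: (P, D), means/log_vars: (K, D), log_weights: (K,).
    Returns (P,) via a numerically stable log-sum-exp over the K components.
    """
    diff = patches[:, None, :] - means[None, :, :]              # (P, K, D)
    log_comp = -0.5 * np.sum(diff**2 / np.exp(log_vars)
                             + log_vars + np.log(2 * np.pi), axis=-1)  # (P, K)
    log_comp += log_weights
    m = log_comp.max(axis=1, keepdims=True)
    return m.squeeze(1) + np.log(np.exp(log_comp - m).sum(axis=1))

# One "layer": patch extraction -> GMM scoring. In a DCGMM-style stack, the
# resulting map (or per-patch posteriors) would feed pooling and the next layer.
rng = np.random.default_rng(0)
img = rng.random((28, 28))                  # stand-in for an MNIST image
K, D = 8, 16                                # 8 components over 4x4 patches
means = rng.random((K, D))
log_vars = np.zeros((K, D))
log_weights = np.log(np.full(K, 1.0 / K))
patches = extract_patches(img)
print(gmm_log_likelihood(patches, means, log_vars, log_weights).shape)
```

Likewise, the abstract's gradient-based sampling through non-invertible operations can be illustrated generically: given a target activation drawn from an upper layer, gradient descent on the lower-layer input drives the non-invertible forward operation toward that target. The sketch below uses 2x2 average pooling purely because its adjoint is simple to write out; it is a generic illustration of the idea, not a reproduction of the paper's sampling procedure.

```python
# Hedged sketch: "inverting" a non-invertible pooling by gradient descent,
# i.e., minimizing 0.5 * ||pool(x) - target||^2 in the lower-layer input x.
import numpy as np

def avg_pool(x):
    """2x2 average pooling of a 2-D array with even side lengths."""
    return 0.25 * (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2])

def avg_pool_grad(g):
    """Adjoint of 2x2 average pooling: spread each entry over its window."""
    return 0.25 * np.repeat(np.repeat(g, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(1)
target = rng.random((14, 14))   # e.g., an activation sampled from the layer above
x = rng.random((28, 28))        # lower-layer input being reconstructed
lr = 1.0
for _ in range(200):
    resid = avg_pool(x) - target
    x -= lr * avg_pool_grad(resid)   # gradient of 0.5 * ||pool(x) - target||^2
print(np.abs(avg_pool(x) - target).max())  # residual shrinks toward zero
```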
One-sentence Summary: We present a deep Gaussian Mixture Model that leverages typical CNN concepts like convolutions and pooling to describe images at a manageable computational cost.
Code of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2104.12686/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=2rnDFVfH4X