Keywords: deep mixture models, sum-product networks, probabilistic circuits, image modeling
Abstract: We present the Deep Convolutional Gaussian Mixture Model (DCGMM), a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference. DCGMM instances exhibit a CNN-like layered structure, in which the principal building blocks are convolutional Gaussian Mixture (cGMM) layers. A key innovation w.r.t. related models like sum-product networks (SPNs) and probabilistic circuits (PCs) is that each cGMM layer optimizes an independent loss function and therefore has an independent probabilistic interpretation. This modular approach permits intervening transformation layers to harness the full spectrum of
(potentially non-invertible) mappings available to CNNs, e.g., max-pooling or (half-)convolutions. DCGMM sampling and inference are realized by a deep chain of hierarchical priors, where samples generated by each cGMM layer parameterize sampling in the next-lower cGMM layer. For sampling through non-invertible transformation layers, we introduce a new gradient-based sharpening technique that exploits redundancy (overlap) in, e.g., half-convolutions. The basic quantities forward-transported through a DCGMM instance are the posterior probabilities of cGMM layers, which ensures numerical stability and facilitates the selection of learning rates.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs. We experimentally show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling, the latter particularly for challenging datasets such as SVHN. A public TF2 implementation is provided as well.
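The chain-of-priors sampling and gradient-based sharpening described in the abstract can be illustrated with a minimal, self-contained TF2 sketch. This is not the authors' public implementation: all names (`ToyGMMLayer`, `sharpen`), shapes, and hyper-parameters are illustrative assumptions, and convolutional patch handling is omitted. Each layer here is a plain diagonal GMM whose posteriors would feed the layer above, and whose samples parameterize component selection in the layer below; sharpening is shown as simple gradient ascent on the lower layer's log-likelihood.

```python
# Illustrative sketch only -- not the authors' public TF2 code. Layer/function
# names, shapes, and hyper-parameters are assumptions for exposition.
import math
import tensorflow as tf

class ToyGMMLayer(tf.Module):
    """A toy (non-convolutional) GMM layer: K diagonal Gaussians over D dims.
    Each such layer maximizes its own log-likelihood (independent loss)."""
    def __init__(self, K, D, name=None):
        super().__init__(name=name)
        self.pi_logits = tf.Variable(tf.zeros([K]))       # mixture weight logits
        self.mu = tf.Variable(tf.random.normal([K, D]))   # component means
        self.log_sigma = tf.Variable(tf.zeros([K, D]))    # log std deviations

    def component_log_probs(self, x):                     # x: [N, D] -> [N, K]
        var = tf.exp(2.0 * self.log_sigma)
        diff = x[:, None, :] - self.mu[None, :, :]        # [N, K, D]
        log_gauss = -0.5 * tf.reduce_sum(
            diff ** 2 / var + 2.0 * self.log_sigma + math.log(2.0 * math.pi),
            axis=-1)
        return log_gauss + tf.nn.log_softmax(self.pi_logits)

    def log_likelihood(self, x):                          # the layer's own loss
        return tf.reduce_logsumexp(self.component_log_probs(x), axis=-1)

    def posteriors(self, x):                              # quantity passed upward
        return tf.nn.softmax(self.component_log_probs(x), axis=-1)

    def sample(self, cond_logits=None, n=1):
        # Hierarchical prior: logits coming from the layer above (its sample,
        # living in this layer's posterior space) replace this layer's own pi.
        logits = (cond_logits if cond_logits is not None
                  else tf.tile(self.pi_logits[None, :], [n, 1]))
        k = tf.random.categorical(logits, 1)[:, 0]        # component indices [N]
        mu = tf.gather(self.mu, k)
        sigma = tf.exp(tf.gather(self.log_sigma, k))
        return mu + sigma * tf.random.normal(tf.shape(mu))

def sharpen(x0, log_density_fn, steps=50, lr=0.1):
    """Gradient-based sharpening: refine a generated sample by gradient ascent
    on the lower layer's log-likelihood (a stand-in for resolving non-invertible
    transformations such as max-pooling or overlapping half-convolutions)."""
    x = tf.Variable(x0)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            ll = tf.reduce_sum(log_density_fn(x))
        x.assign_add(lr * tape.gradient(ll, x))
    return tf.convert_to_tensor(x)

# Top-down sampling through a two-layer chain: the top layer's sample acts as
# (unnormalized) log-priors for component selection in the layer below.
top = ToyGMMLayer(K=10, D=25)   # models the lower layer's 25-dim posteriors
low = ToyGMMLayer(K=25, D=64)   # models 64-dim (flattened) image patches
z = top.sample(n=4)             # [4, 25] pseudo-posteriors for the lower layer
x = low.sample(cond_logits=z)   # [4, 64] conditioned lower-layer sample
x = sharpen(x, low.log_likelihood)  # refine sample through gradient ascent
```

Training in this sketch would simply maximize each layer's `log_likelihood` on its own inputs by SGD, mirroring the abstract's claim that every cGMM layer optimizes an independent loss while the whole stack trains end-to-end.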
One-sentence Summary: A conceptually new approach to probabilistic image modeling based on multiple linked GMMs, which generates samples of superior quality compared to related approaches, particularly for SVHN.