Abstract: Variational autoencoders (VAEs) are among the most popular unsupervised generative models that rely on learning latent representations of data. In this article, we extend the classical concept of Gaussian mixtures into the deep variational framework by proposing a mixture of VAEs (MVAE). Each component in the MVAE model is implemented by a variational encoder and has an associated sub-decoder. The separation between the latent spaces modeled by different encoders is enforced using the $d$-variable Hilbert–Schmidt independence criterion (dHSIC), so that each component captures different variational features of the data. We also propose a mechanism for finding the appropriate number of VAE components for a given task, leading to an optimal architecture. The differentiable categorical Gumbel-softmax distribution is used to generate dropout masking parameters within the end-to-end backpropagation training framework. Extensive experiments show that the proposed MVAE model can learn a rich latent data representation and is able to discover additional underlying data representation factors.
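The abstract does not include the authors' code; the following is a minimal sketch of how the dHSIC penalty mentioned above could be computed between the latent codes of the mixture components, using the biased empirical estimator of Pfister et al. (2018) with Gaussian kernels. The kernel bandwidth `sigma` and the way the penalty is weighted into the training loss are illustrative assumptions, not details taken from the paper.

```python
import torch


def gaussian_kernel(z: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Pairwise Gaussian (RBF) kernel matrix for a batch of latent codes (n, dim)."""
    sq_dists = torch.cdist(z, z, p=2).pow(2)            # (n, n) squared Euclidean distances
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))


def dhsic(latents: list[torch.Tensor], sigma: float = 1.0) -> torch.Tensor:
    """Biased empirical d-variable HSIC estimator.

    `latents` holds one (n, dim_k) batch of latent codes per VAE component;
    the value approaches 0 when the component codes are mutually independent,
    so adding it to the loss penalizes overlap between the latent spaces.
    """
    n = latents[0].shape[0]
    kernels = [gaussian_kernel(z, sigma) for z in latents]   # one kernel matrix per component

    term1 = torch.ones(n, n, device=latents[0].device)       # elementwise product of kernel matrices
    term2 = torch.tensor(1.0, device=latents[0].device)      # product of kernel grand means
    term3 = torch.ones(n, device=latents[0].device)          # product of kernel row means
    for K in kernels:
        term1 = term1 * K
        term2 = term2 * K.mean()
        term3 = term3 * K.mean(dim=1)

    return term1.mean() + term2 - 2.0 * term3.mean()
```

For the component-selection mechanism, a differentiable near-one-hot dropout mask over components can be drawn with `torch.nn.functional.gumbel_softmax(logits, tau=1.0, hard=True)`, which keeps the masking step inside the end-to-end backpropagation framework; how the paper parameterizes those logits is not specified in the abstract.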