Latent Variable Autoencoder

Wenjuan Han, Ge Wang, Kewei Tu

2019 (modified: 16 Apr 2023) | IEEE Access 2019 | Readers: Everyone

Abstract: Learning to discover hidden variables from unlabeled data is an important task. Traditional generative methods model the generation process of both the observed and the hidden variables. However, tractable inference and learning in these models require strong conditional independence assumptions among the observed and hidden variables. To address this limitation, we propose an autoencoder framework. The encoder produces an intermediate representation from the observed variables, and the decoder is a generative latent variable model, conditioned on the intermediate representation, that generates the hidden variables and reconstructs the observed variables. We introduce three variants of our framework with either a deterministic or a stochastic encoding process. To optimize our models, we propose an algorithm similar to the classic expectation-maximization (EM) algorithm that supports online learning on large-scale datasets. The flexibility of our framework allows it to be applied to various scenarios in which explicit inference of hidden variables is desired. We discuss applications of our framework to the perceptual grouping task and the part-of-speech (POS) induction task. Our experiments on both tasks demonstrate that our framework can achieve better performance than vanilla latent variable generative models.
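
As a rough illustration of the encoder/decoder split and the EM-like optimization described in the abstract, the following is a minimal sketch under strong assumptions. It is not the paper's model or algorithm. It assumes a single categorical hidden variable z per observed symbol x, a fixed deterministic encoder (a tanh of an embedding lookup), and a decoder factorized as p(z | h) p(x | z). All names (`encode`, `em_step`, `E`, `W`, `B`, `lr`) are hypothetical and chosen only for this example.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def encode(X, E):
    # Deterministic encoder (assumed form): embedding lookup plus tanh.
    # X: (N,) integer symbols, E: (V, D) embedding table. Returns H: (N, D).
    return np.tanh(E[X])

def em_step(X, H, W, B, lr=0.5):
    """One EM-like update for the assumed decoder p(z | h) p(x | z).

    X: (N,) observed symbols, H: (N, D) encoder outputs,
    W: (D, K) weights of p(z | h), B: (K, V) emission table p(x | z).
    """
    prior = softmax(H @ W)                       # p(z | h), shape (N, K)
    joint = prior * B[:, X].T                    # p(z, x | h), shape (N, K)
    loglik = np.log(joint.sum(axis=1)).sum()     # reconstruction log-likelihood
    # E-step: posterior over the hidden variable, p(z | x, h).
    post = joint / joint.sum(axis=1, keepdims=True)
    # M-step for B: expected counts with a small smoothing constant.
    counts_T = np.full((B.shape[1], B.shape[0]), 1e-3)   # shape (V, K)
    np.add.at(counts_T, X, post)                 # counts_T[x, z] += post[n, z]
    B_new = counts_T.T / counts_T.T.sum(axis=1, keepdims=True)
    # Partial M-step for W: one gradient ascent step on E_post[log p(z | h)].
    W_new = W + lr * (H.T @ (post - prior)) / len(X)
    return W_new, B_new, post, loglik
```

Because `em_step` only needs the symbols and encoder outputs of the batch it is given, it could in principle be called on minibatches, which is one simple way to picture the online setting the abstract mentions. The posterior `post` returned by the E-step is the explicit inference over the hidden variable; taking its argmax per row would give a hard assignment.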
