Variational Sequential Modeling, Learning and Understanding

Published: 2021, Last Modified: 04 Nov 2025, ASRU 2021, CC BY-SA 4.0
Abstract: A normalizing flow comprises a series of invertible transformations. With carefully designed transformations, it can generate images or speech with fast sampling, and inference can be carried out efficiently in a maximum-likelihood manner. Beyond generating scenes or human faces, it can also be used to transform probability distributions. On the other hand, learning the latent structure of sentences in a global manner remains challenging. The variational autoencoder (VAE) suffers from posterior collapse, where the latent space is poorly learned. To improve inference and generation of the VAE for sequence data, we propose the amortized flow posterior variational recurrent autoencoder (AFP-VRAE). The variational recurrent autoencoder (VRAE) has an RNN-based encoder and decoder and learns global representations of sentences. To learn a latent space that preserves the semantic information of the data, we use a normalizing flow to construct flexible variational distributions. Furthermore, we adopt amortized regularization to encode similar embeddings into neighboring latent representations, and we use skip connections so that the latent representation directly contributes to predicting every output. The benefits are demonstrated in experiments evaluating the models on language modeling, sentiment analysis, and document summarization, where AFP-VRAE achieves strong results in variational modeling of sequence data.
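As a minimal illustration of the flow machinery the abstract refers to (not the paper's AFP-VRAE itself), the sketch below applies the change-of-variables rule that every normalizing flow relies on: for an invertible map z = f(x), log p_x(x) = log p_z(f(x)) + log|det J_f(x)|. A single elementwise affine step stands in for the stacked invertible transformations; the `affine_forward` and `log_prob` names are illustrative, not from the paper.

```python
import numpy as np

def affine_forward(x, scale, shift):
    """One invertible affine flow step z = scale * x + shift (scale != 0)."""
    z = scale * x + shift
    # The Jacobian of an elementwise affine map is diagonal,
    # so log|det J| = sum(log|scale|).
    log_det = np.sum(np.log(np.abs(scale)))
    return z, log_det

def log_prob(x, scale, shift):
    """Log-density of x under the flow with a standard-normal base density."""
    z, log_det = affine_forward(x, scale, shift)
    base_logp = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2 * np.pi)
    return base_logp + log_det

x = np.array([0.5, -1.0])
scale = np.array([2.0, 0.5])
shift = np.array([0.0, 1.0])
print(log_prob(x, scale, shift))  # exact density, no sampling needed
```

Stacking many such invertible steps (with richer transforms) yields flexible densities whose exact log-likelihood remains tractable, which is what makes flows attractive as variational posteriors in a VAE.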