Couple-VAE: Mitigating the Encoder-Decoder Incompatibility in Variational Text Modeling with Coupled Deterministic Networks

Chen Wu; Prince Zizhuang Wang; William Yang Wang

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

Abstract: The variational autoencoder (VAE) combines latent variable models and amortized variational inference. Despite its theoretical appeal, optimizing VAEs for text modeling suffers from posterior collapse, where the decoder ignores the latent codes and the posterior becomes nearly identical to the prior. We show that VAE training dynamics suffer from encoder-decoder incompatibility, in which the encoder receives scarce backpropagated gradients from the decoder and little encoded information is passed to the decoder. We propose a model-agnostic approach, named Couple-VAE, to mitigate this issue. Specifically, we couple the VAE model with a deterministic network of the same structure, which is optimized with the reconstruction loss alone, without any regularization (e.g., the KL divergence). To enrich the backpropagated gradients for the encoder, we share the encoder between the deterministic network and the stochastic network. To encourage nontrivial decoding signals, we propose a coupling loss that pushes the stochastic decoding signals toward the deterministic ones. We conduct extensive experiments on the Penn Treebank, Yelp, and Yahoo datasets. We apply the proposed method to a range of variational text models with different regularization terms, posterior families, decoder architectures, and optimization strategies, and observe consistently improved text modeling results in terms of probability estimation and the richness of the encoded text.
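As a rough illustration of how the loss terms named in the abstract might combine, the following is a minimal sketch in plain Python. It is not the authors' implementation. The function names, the squared-L2 form of the coupling loss, the diagonal Gaussian posterior with a standard normal prior, and the `coupling_weight` parameter are all assumptions made for illustration. The sketch works on precomputed numbers only, so it does not model the shared encoder or gradient flow.

```python
import math


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)) for one latent vector.

    Assumes a diagonal Gaussian posterior and a standard normal prior.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def coupling_loss(h_stochastic, h_deterministic):
    """Hypothetical coupling term: squared L2 distance between the decoding
    signal of the stochastic (VAE) path and that of the deterministic path."""
    return sum((s - d) ** 2 for s, d in zip(h_stochastic, h_deterministic))


def couple_vae_objective(rec_stochastic, rec_deterministic, mu, logvar,
                         h_stochastic, h_deterministic, coupling_weight=1.0):
    """Combine the terms described in the abstract into one scalar loss.

    rec_stochastic:    reconstruction loss of the VAE path
    rec_deterministic: reconstruction loss of the deterministic path,
                       which carries no KL regularizer
    coupling_weight:   assumed scalar weight on the coupling term
    """
    vae_loss = rec_stochastic + gaussian_kl(mu, logvar)
    coupling = coupling_loss(h_stochastic, h_deterministic)
    return vae_loss + rec_deterministic + coupling_weight * coupling


# A posterior equal to the prior gives zero KL, the collapsed regime.
collapsed_kl = gaussian_kl([0.0, 0.0], [0.0, 0.0])

total = couple_vae_objective(
    rec_stochastic=2.0, rec_deterministic=1.0,
    mu=[1.0], logvar=[0.0],
    h_stochastic=[1.0, 0.0], h_deterministic=[0.0, 0.0],
    coupling_weight=0.5,
)
```

In a real model, the two reconstruction losses and the decoding signals would come from networks that share one encoder, so the unregularized deterministic path would also contribute gradients to that encoder.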

Keywords: variational autoencoders, posterior collapse, text modeling, amortized variational inference
