Multi-modal Variational Encoder-Decoders

Iulian V. Serban, Alexander G. Ororbia II, Joelle Pineau, Aaron Courville

Nov 04, 2016 (modified: Dec 11, 2016) ICLR 2017 conference submission readers: everyone
  • Abstract: Recent advances in neural variational inference have facilitated efficient training of powerful directed graphical models with continuous latent variables, such as variational autoencoders. However, these models usually assume simple, uni-modal priors — such as the multivariate Gaussian distribution — yet many real-world data distributions are highly complex and multi-modal. Examples of complex and multi-modal distributions range from topics in newswire text to conversational dialogue responses. When such latent variable models are applied to these domains, the restriction of the simple, uni-modal prior hinders the overall expressivity of the learned model as it cannot possibly capture more complex aspects of the data distribution. To overcome this critical restriction, we propose a flexible, simple prior distribution which can be learned efficiently and potentially capture an exponential number of modes of a target distribution. We develop the multi-modal variational encoder-decoder framework and investigate the effectiveness of the proposed prior in several natural language processing modeling tasks, including document modeling and dialogue modeling.
  • TL;DR: Learning continuous multi-modal latent variables in the variational autoencoder framework for text processing applications.
  • Keywords: Deep learning, Structured prediction, Natural language processing
  • Conflicts: umontreal.ca, psu.edu, cs.mcgill.ca
