Deep Generative Models for learning Coherent Latent Representations from Multi-Modal Data

Sep 27, 2018 ICLR 2019 Conference Blind Submission readers: everyone
  • Abstract: Multi-modal generative models based on the Variational Autoencoder (VAE) are an emerging research topic for sensor fusion and bi-directional modality exchange. This contribution gives insight into the learned joint latent representation and shows that expressiveness and coherence are decisive properties for multi-modal datasets. Furthermore, we propose a multi-modal VAE, derived from the full joint marginal log-likelihood, that is able to learn the most meaningful representation for ambiguous observations. Since our approach relies on the properties of multi-modal sensor setups, but datasets exhibiting them are hardly available, we also propose a technique to generate correlated datasets from uni-modal ones.
  • Keywords: Multi-Modal Deep Generative Models, Sensor Fusion, Data Generation, VAE
  • TL;DR: Deriving a general formulation of a multi-modal VAE from the joint marginal log-likelihood.
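
For orientation, the textbook evidence lower bound on the joint marginal log-likelihood of two modalities can be written as below. The notation is an assumption made here and is not taken from the submission: a and b denote the two modalities, z the shared latent variable, p_θ the decoders, q_φ the joint encoder, and the modalities are assumed conditionally independent given z. The formulation derived in the paper from the full joint marginal log-likelihood may contain further terms that the abstract does not state.

```latex
\begin{align}
\log p_\theta(a, b)
  &= \log \int p_\theta(a \mid z)\, p_\theta(b \mid z)\, p(z)\, \mathrm{d}z \\
  &\ge \mathbb{E}_{q_\phi(z \mid a, b)}
       \bigl[ \log p_\theta(a \mid z) + \log p_\theta(b \mid z) \bigr]
     - D_{\mathrm{KL}}\bigl( q_\phi(z \mid a, b) \,\|\, p(z) \bigr)
\end{align}
```

The inequality follows from Jensen's inequality applied to the expectation under q_φ(z | a, b); the gap is the KL divergence between q_φ(z | a, b) and the true posterior p_θ(z | a, b).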