Keywords: Multimodal Machine Learning, Representation Learning, AutoEncoders
Abstract: Multi-view data has become ubiquitous, especially with multi-sensor systems such as self-driving cars or bedside patient monitors.
We look at modeling multi-view data through robust representation learning, with the goal of leveraging relationships between views and building resilience to missing information.
We propose a new flavor of multi-view AutoEncoders, the Robust Multi-view AutoEncoder, which explicitly encourages robustness to missing views.
The principle is straightforward: we apply the idea of dropout at the level of views.
During training, we randomly leave out views from the model's input while forcing it to reconstruct all of them.
We also consider a flow-based generative modeling extension of our approach in the case where all the views are available.
We conduct experiments under two scenarios: directly using the learned representations for reconstruction, and a two-step process where the learned representation subsequently serves as input features for a downstream task.
Our synthetic and real-world experiments show promising results for the application of these models to robust representation learning.
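As a rough illustration of the view-level dropout idea described above, here is a minimal PyTorch sketch. It assumes vector-valued views, per-view encoder/decoder MLPs, and a shared code formed by averaging the encoded views that were kept; the names (RobustMultiViewAE, view_dropout_p, etc.) are illustrative and not taken from the paper.

```python
# Minimal sketch of view-level dropout training (names and architecture are
# assumptions for illustration, not the paper's actual model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RobustMultiViewAE(nn.Module):
    def __init__(self, view_dims, latent_dim=32, hidden=64):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, latent_dim))
             for d in view_dims])
        self.decoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, d))
             for d in view_dims])

    def encode(self, views, keep_mask):
        # Encode each view, zero out dropped views, and average into a shared code z.
        codes = [enc(v) * m for enc, v, m in zip(self.encoders, views, keep_mask)]
        denom = torch.clamp(sum(keep_mask), min=1.0)
        return sum(codes) / denom

    def forward(self, views, keep_mask):
        z = self.encode(views, keep_mask)
        # Decode the shared code back into every view, including the dropped ones.
        return [dec(z) for dec in self.decoders]

def train_step(model, optimizer, views, view_dropout_p=0.5):
    batch = views[0].shape[0]
    # Per-sample binary keep-mask over views; force the first view back on
    # whenever all views would otherwise be dropped.
    mask = (torch.rand(len(views), batch, 1) > view_dropout_p).float()
    mask[0] = torch.clamp(mask[0] + (mask.sum(dim=0) == 0).float(), max=1.0)
    recons = model(views, mask)
    # Reconstruction loss over all views, whether or not they were given as input.
    loss = sum(F.mse_loss(r, v) for r, v in zip(recons, views))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Averaging the surviving view codes is only one simple way to fuse views; the point of the sketch is the training signal, i.e. reconstructing every view from whatever subset was kept.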
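For the two-step scenario, the following hedged sketch shows how the frozen model's shared code might be reused as features for a downstream classifier; the helper and variable names below are hypothetical.

```python
# Hypothetical two-step usage: freeze the trained model and reuse its shared
# code as features for a downstream classifier (e.g. scikit-learn).
from sklearn.linear_model import LogisticRegression

def extract_features(model, views):
    # All views are available at feature-extraction time, so keep every view.
    keep_all = torch.ones(len(views), views[0].shape[0], 1)
    with torch.no_grad():
        return model.encode(views, keep_all)

# z = extract_features(model, [view_a, view_b])
# clf = LogisticRegression(max_iter=1000).fit(z.numpy(), labels)
```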
One-sentence Summary: Robust representation learning for multi-view data using a two-tiered AutoEncoder with a flow-based generative modeling extension.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=0NkIoznL87