Learning Consistent Deep Generative Models from Sparsely Labeled Data

Published: 29 Jan 2022, Last Modified: 05 May 2023 · AABI 2022 Poster
Keywords: Deep generative models, variational autoencoders, semi-supervised learning
TL;DR: We introduce a new approach for semi-supervised learning with variational autoencoders that addresses several issues in prior work.
Abstract: We consider training deep generative models toward two simultaneous goals: discriminative classification and generative modeling using an explicit likelihood. While variational autoencoders (VAEs) offer a promising solution, we show that the dominant approach to training semi-supervised VAEs has several key weaknesses: it is fragile as generative modeling capacity increases, it is slow due to a required marginalization over labels, and it incoherently decouples into separate discriminative and generative models when all data is labeled. We remedy these concerns in a new proposed framework for semi-supervised VAE training that considers a more coherent downstream model architecture and a new objective which maximizes generative quality subject to a task-specific prediction constraint that ensures discriminative quality. We further enforce a consistency constraint, derived naturally from the generative model, that requires predictions on reconstructed data to match those on the original data. We show that our contributions -- a downstream architecture with prediction constraints and consistency constraints -- lead to improved generative samples as well as accurate image classification, with consistency particularly crucial for accuracy on sparsely-labeled datasets. Our approach enables advances in generative modeling to directly boost semi-supervised classification, an ability we demonstrate by learning a "very deep" prediction-constrained VAE with many layers of latent variables.
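The abstract describes an objective combining a generative (reconstruction) term, a task-specific prediction constraint on labeled examples, and a consistency constraint requiring the classifier's predictions on reconstructed data to match those on the original data. The following is a minimal NumPy sketch of such a combined loss, not the paper's actual implementation: the linear encoder/decoder/classifier, the deterministic latent code (no sampling or KL-to-prior term), and all weight names are simplifying assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def semi_supervised_loss(x, y, W_enc, W_dec, W_clf,
                         lam_pred=1.0, lam_cons=1.0):
    """Toy prediction-constrained + consistency objective (illustrative only).

    x: (N, D) inputs; y: (N,) integer labels, with -1 marking unlabeled rows.
    W_enc, W_dec, W_clf: hypothetical encoder, decoder, classifier weights.
    """
    z = x @ W_enc                       # deterministic "encoder" (no sampling)
    x_hat = z @ W_dec                   # reconstruction from the latent code
    recon = np.mean((x - x_hat) ** 2)   # generative term (Gaussian likelihood)

    p = softmax(z @ W_clf)                      # predictions on original data
    p_hat = softmax((x_hat @ W_enc) @ W_clf)    # predictions on reconstruction

    # prediction constraint: cross-entropy on the labeled subset only
    labeled = y >= 0
    pred = 0.0
    if labeled.any():
        pred = -np.mean(np.log(p[labeled, y[labeled]] + 1e-12))

    # consistency constraint: KL(p(y|x) || p(y|x_hat)), averaged over all rows
    cons = np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(p_hat + 1e-12)),
                          axis=1))

    return recon + lam_pred * pred + lam_cons * cons
```

Note that the consistency term uses every example, labeled or not, which is how such a constraint can extract training signal from the unlabeled majority of a sparsely labeled dataset.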