Provable Concept Learning for Interpretable Predictions Using Variational Autoencoders

21 May 2022, 02:42 (modified: 15 Jun 2022, 10:12) · ICML-AI4Science Poster
Keywords: Interpretable predictions, interpretable concepts, variational inference, variational autoencoders
TL;DR: We propose a VAE-based framework with provable guarantees for interpretable predictions.
Abstract: In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available. Many attempts to provide such explanations revolve around pixel-level attributions or rely on previously known concepts. In this paper, we aim to provide explanations by provably identifying \emph{high-level, previously unknown concepts}. To this end, we propose a probabilistic modeling framework to derive (C)oncept (L)earning and (P)rediction (CLAP), a VAE-based classifier that uses visually interpretable concepts as linear predictors. Assuming that the data-generating mechanism involves interpretable concepts, we prove that our method identifies them while attaining optimal classification accuracy. We validate the approach on synthetic experiments, and show that on the ChestXRay dataset, CLAP effectively discovers interpretable factors for classifying diseases.
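The abstract describes a classifier whose predictions are a linear function of concept latents inferred by a VAE encoder. The forward pass of such an architecture can be sketched as follows; this is a minimal illustrative sketch, not the paper's implementation, and all dimensions, weight names, and the purely linear encoder are assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative only, not from the paper).
x_dim, concept_dim, n_classes = 16, 4, 2

# Encoder: amortized Gaussian posterior q(z|x) over concept latents
# (a single linear layer here, purely for brevity).
W_mu = rng.normal(size=(concept_dim, x_dim)) * 0.1
W_logvar = rng.normal(size=(concept_dim, x_dim)) * 0.1

# Linear classifier head: the label is a linear function of the concepts,
# so each concept's contribution to the prediction can be read off directly.
W_cls = rng.normal(size=(n_classes, concept_dim))
b_cls = np.zeros(n_classes)

def encode(x):
    """Return mean and log-variance of the Gaussian posterior q(z|x)."""
    return W_mu @ x, W_logvar @ x

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (reparameterization trick)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def predict(x):
    """Infer concept latents, then classify via a linear map on them."""
    mu, logvar = encode(x)
    z = reparameterize(mu, logvar)
    logits = W_cls @ z + b_cls
    return z, logits

x = rng.normal(size=x_dim)
concepts, logits = predict(x)
print(concepts.shape, logits.shape)  # (4,) (2,)
```

Training such a model would additionally optimize the usual VAE evidence lower bound plus a classification loss; the sketch only shows why the linear head makes the learned concepts directly inspectable as predictors.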
Track: Original Research Track