Provable Concept Learning for Interpretable Predictions Using Variational Autoencoders

Armeen Taeb; Nicolò Ruggeri; Carina Schnuck; Fanny Yang

Published: 15 Jun 2022, Last Modified: 04 May 2025

ICML-AI4Science Poster

Readers: Everyone

Keywords: Interpretable predictions, interpretable concepts, variational inference, variational autoencoders

TL;DR: We propose a VAE-based framework with provable guarantees for interpretable predictions.

Abstract: In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available. Many attempts to provide such explanations revolve around pixel-level attributions or use previously known concepts. In this paper, we aim to provide explanations by provably identifying high-level, previously unknown concepts. To this end, we propose a probabilistic modeling framework to derive (C)oncept (L)earning and (P)rediction (CLAP), a VAE-based classifier that uses visually interpretable concepts as linear predictors. Assuming that the data-generating mechanism involves interpretable concepts, we prove that our method is able to identify them while attaining optimal classification accuracy. We validate the method with synthetic experiments and also show that, on the ChestXRay dataset, CLAP effectively discovers interpretable factors for classifying diseases.

Track: Original Research Track

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/provable-concept-learning-for-interpretable/code)
