Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions
Keywords: Interpretability, Concepts, Causal Representation Learning
TL;DR: We describe a framework that provides theoretical guarantees on the correctness of learning concepts from data and on the number of required labels.
Abstract: Machine learning is a vital part of many real-world systems, but several concerns
remain about the lack of interpretability, explainability and robustness of black-box
AI systems. Concept Bottleneck Models (CBM) address some of these challenges
by learning interpretable concepts from high-dimensional data, e.g. images, which
are used to predict labels. An important issue in CBMs is spurious correlations
between concepts, which effectively lead to learning “wrong” concepts. Current
mitigation strategies rely on strong assumptions, e.g., they assume that the concepts
are statistically independent of each other, or require substantial interaction in
terms of both interventions and labels provided by annotators. In this paper, we
describe a framework that provides theoretical guarantees on the correctness of
the learned concepts and on the number of required labels, without requiring any
interventions. Our framework leverages causal representation learning (CRL)
methods to learn latent causal variables from high-dimensional observations in
an unsupervised way, and then aligns these variables with interpretable
concepts using only a few concept labels. We propose a linear and a non-parametric
estimator for this mapping, providing a finite-sample, high-probability result in the
linear case and an asymptotic consistency result for the non-parametric estimator.
We evaluate our framework on synthetic and image benchmarks, showing that the
learned concepts have fewer impurities and are often more accurate than those of other CBMs,
even in settings with strong correlations between concepts.
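The abstract describes a two-stage pipeline: unsupervised causal representation learning followed by label-efficient alignment of the learned latents with concepts. As a rough illustration of the second stage only, the sketch below fits a linear alignment map from simulated latents to concepts using a small number of concept labels. The synthetic data-generating process, the stand-in for the CRL encoder, and all variable names are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of the label-efficient alignment step described above:
# (1) a CRL encoder (stubbed here as an unknown linear transform) produces
#     latent variables from observations without supervision;
# (2) a linear estimator aligns those latents with concepts using only a
#     small number of concept-labeled examples.
import numpy as np
from numpy.linalg import lstsq

rng = np.random.default_rng(0)

# Synthetic ground truth: 5 concepts generating 50-dimensional observations.
n_concepts, obs_dim = 5, 50
true_mix = rng.normal(size=(obs_dim, n_concepts))      # concepts -> observations
concepts = rng.normal(size=(10_000, n_concepts))       # ground-truth concepts
observations = concepts @ true_mix.T + 0.05 * rng.normal(size=(10_000, obs_dim))

# Stage 1 (stand-in for an unsupervised CRL method): assume the latents are
# recovered up to an unknown invertible linear transformation, as CRL
# identifiability results typically guarantee.
unknown_transform = rng.normal(size=(n_concepts, n_concepts))
latents = concepts @ unknown_transform.T               # what the encoder outputs

# Stage 2: align latents with interpretable concepts from a few labels.
n_labels = 50                                          # "few" concept labels
Z_lab, C_lab = latents[:n_labels], concepts[:n_labels]
W, *_ = lstsq(Z_lab, C_lab, rcond=None)                # linear alignment map

# Evaluate the alignment on held-out data.
pred = latents[n_labels:] @ W
mse = np.mean((pred - concepts[n_labels:]) ** 2)
print(f"held-out concept MSE with {n_labels} labels: {mse:.4f}")
```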
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 10711