Keywords: Concept Bottleneck Models, Interpretability, Concepts, Reasoning, Explainable AI, Interventions, Leakage
TL;DR: We propose a flexible and efficient framework for incorporating and supplementing prior knowledge in CBMs.
Abstract: We propose a novel, flexible, and efficient framework for designing Concept Bottleneck Models (CBMs) that enables practitioners
to explicitly encode their prior knowledge and beliefs about concept-concept ($C-C$) and concept-task ($C \to Y$) relationships into the model's reasoning.
The resulting **C**oncept **REA**soning **M**odels (CREAMs) architecturally encode potentially sparse $C \to Y$ relationships, as well as various types of $C-C$ relationships such as mutual exclusivity, hierarchical associations, and correlations.
Moreover, CREAMs can include a regularized side-channel that complements potentially incomplete concept sets, achieving competitive task performance while encouraging predictions to remain concept-grounded.
Our experiments show that, without additional computational overhead, the CREAM designs (i) allow for efficient and accurate interventions by avoiding leakage, and (ii) achieve task performance on par with black-box models.
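To make the two mechanisms the abstract names concrete, here is a minimal PyTorch-style sketch of a prior-knowledge mask over the $C \to Y$ layer plus a regularized side-channel. This is an illustration under assumptions, not the authors' implementation: the names (`CREAMHead`, `mask`, `side_dim`) and the choice of an L2 penalty on the side-channel output are hypothetical.

```python
# Sketch (hypothetical, not the paper's code): a C -> Y head whose weights are
# elementwise-masked by prior knowledge, plus an optional side-channel whose
# output is returned so a loss can regularize it toward concept-grounded use.
import torch
import torch.nn as nn


class CREAMHead(nn.Module):
    def __init__(self, n_concepts: int, n_classes: int,
                 mask: torch.Tensor, side_dim: int = 0):
        super().__init__()
        # mask[j, i] = 1 iff prior knowledge allows concept i to affect class j;
        # zeros architecturally enforce a sparse C -> Y structure.
        self.register_buffer("mask", mask.float())
        self.weight = nn.Parameter(0.01 * torch.randn(n_classes, n_concepts))
        self.bias = nn.Parameter(torch.zeros(n_classes))
        # Optional side-channel from raw features, to complement an
        # incomplete concept set; meant to be penalized by the training loss.
        self.side = nn.Linear(side_dim, n_classes) if side_dim > 0 else None

    def forward(self, concepts, side_features=None):
        # Masked linear map: only permitted concept-class weights contribute.
        logits = concepts @ (self.weight * self.mask).t() + self.bias
        side_out = None
        if self.side is not None and side_features is not None:
            side_out = self.side(side_features)
            logits = logits + side_out
        return logits, side_out  # side_out exposed for regularization

# C-C structure such as mutual exclusivity could be encoded upstream, e.g. by
# a softmax over the logits of a mutually exclusive concept group:
#   concepts[:, group] = torch.softmax(concept_logits[:, group], dim=-1)

# Usage sketch: 2 classes, 3 concepts, an 8-dim side-channel.
mask = torch.tensor([[1, 1, 0], [0, 1, 1]])
head = CREAMHead(n_concepts=3, n_classes=2, mask=mask, side_dim=8)
logits, side_out = head(torch.rand(4, 3), torch.randn(4, 8))
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 0, 1]))
if side_out is not None:
    loss = loss + 1e-2 * side_out.pow(2).mean()  # keep predictions concept-grounded
```

Because the mask zeroes weights rather than merely penalizing them, interventions on a concept can only propagate to the classes that prior knowledge permits, which is one way a design like this could avoid leakage through spurious concept-class paths.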
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 18394