Simulating Concept Bottlenecks Using Chain-of-Thought Reasoning

ACL ARR 2025 February Submission 3806 Authors

15 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: In high-stakes domains such as healthcare and finance, understanding why a model makes a prediction is often as important as the prediction itself. Concept Bottleneck Models (CBMs) improve transparency by first predicting interpretable concepts -- typically from an image -- and only then making the final prediction, which allows experts to validate and correct these intermediate concepts. In this paper, we show how CBMs can be effectively implemented with (Vision-)Language Models by leveraging their chain-of-thought reasoning. We fine-tune the model with the standard cross-entropy loss; the resulting approach maintains prediction quality and achieves high accuracy on the intermediate concepts, effectively simulating a CBM without any architectural modifications. We demonstrate the effectiveness of our method on synthetic and real-world datasets, showing that it matches or exceeds the performance of traditional CBMs. Our method not only simplifies the implementation of CBMs but also leverages the extensive knowledge that VLMs acquire during pretraining.
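
As a rough illustration of the setup the abstract describes, the sketch below fine-tunes a causal language model so that its output first states the interpretable concepts and then the final label, using ordinary next-token cross-entropy. This is a minimal sketch under stated assumptions, not the authors' implementation: the model name, prompt wording, concept format, and example fields are all illustrative.

```python
# Minimal sketch (not the authors' code): fine-tune a (V)LM so its chain of
# thought first emits the concepts, then the label, with plain cross-entropy.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-0.5B-Instruct"  # assumption: any causal (V)LM could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# One hypothetical training example: the concepts act as the textual
# "bottleneck" the model must produce before committing to a prediction.
example = {
    "input": "Describe the bird and classify its species.",
    "concepts": ["wing color: black", "beak shape: hooked", "size: large"],
    "label": "Common Raven",
}

# Target = concepts first, then the answer, serialized as ordinary text.
target = (
    "Concepts: " + "; ".join(example["concepts"]) + "\n"
    "Answer: " + example["label"]
)
text = example["input"] + "\n" + target + tokenizer.eos_token

# Standard next-token cross-entropy over the sequence; no architectural
# change is needed to obtain the concept-bottleneck behaviour.
batch = tokenizer(text, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
```

At inference time, the same prompt format would let an expert read (and, if needed, correct) the generated concept list before accepting the final answer, mirroring the intervention step of a traditional CBM.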
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: explanation faithfulness, free-text/natural language explanations, hierarchical & concept explanations
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 3806