Zero-shot Concept Bottleneck Models

18 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: concept bottleneck models, vision-language models
TL;DR: We introduce an interpretable and intervenable model family called zero-shot concept bottleneck models, which can provide concept-based explanations for its predictions in a fully zero-shot manner.
Abstract: Concept bottleneck models (CBMs) are inherently interpretable and intervenable neural network models, which explain their final class label predictions via intermediate predictions of high-level semantic concepts. However, they require target-task training to learn the input-to-concept and concept-to-class mappings, which necessitates collecting target datasets and significant training resources. In this paper, we present zero-shot concept bottleneck models (Z-CBMs), which predict concepts and labels in a fully zero-shot manner without additional training of neural networks. Z-CBMs leverage a large-scale concept bank, comprising millions of vocabulary terms extracted from the web, to describe diverse inputs across various domains. For the input-to-concept mapping, we introduce concept retrieval, which dynamically identifies input-related concepts through cross-modal search within the concept bank. For the concept-to-class inference, we apply concept regression, which selects essential concepts from the retrieved ones via sparse linear regression. Through extensive experiments, we demonstrate that our Z-CBMs provide interpretable and intervenable concepts without any additional training.
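The two stages summarized in the abstract could be sketched roughly as follows: concept retrieval as a cosine-similarity search over a bank of concept embeddings, and concept regression as a sparse linear fit of the input embedding to the retrieved concept embeddings. This is a minimal NumPy illustration under assumed CLIP-style shared embeddings; the bank contents, dimensions, and the ISTA-style solver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def retrieve_concepts(x, concept_bank, k=5):
    """Return indices of the k concepts most similar to input embedding x.

    Assumes x (d,) and concept_bank (n, d) live in a shared image-text
    embedding space (e.g., CLIP), which the paper's cross-modal search implies.
    """
    sims = concept_bank @ x / (
        np.linalg.norm(concept_bank, axis=1) * np.linalg.norm(x) + 1e-8
    )
    return np.argsort(-sims)[:k]

def concept_regression(x, C, lam=0.1, steps=200, lr=0.01):
    """Sparse linear regression of x onto retrieved concept embeddings C (k, d).

    Solves min_w 0.5 * ||C.T @ w - x||^2 + lam * ||w||_1 with ISTA
    (gradient step + soft-thresholding); a stand-in for any lasso solver.
    """
    w = np.zeros(C.shape[0])
    for _ in range(steps):
        grad = C @ (C.T @ w - x)          # gradient of the squared-error term
        w = w - lr * grad
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # L1 shrinkage
    return w

# Toy usage: a synthetic bank where the input is one of the bank entries.
rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 32))         # 100 hypothetical concept embeddings
x = bank[7].copy()                        # input embedding
idx = retrieve_concepts(x, bank, k=5)     # top-5 related concepts
weights = concept_regression(x, bank[idx])
```

The nonzero entries of `weights` indicate which retrieved concepts the (hypothetical) final prediction would be attributed to; intervening on a concept would correspond to editing its weight before the concept-to-class step.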
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 10596