Enhancing Concept Localization in CLIP-based Concept Bottleneck Models

TMLR Paper 6082 Authors

03 Oct 2025 (modified: 09 Oct 2025) · Under review for TMLR · CC BY 4.0
Abstract: This paper addresses explainable AI (XAI) through the lens of Concept Bottleneck Models (CBMs) that do not require explicit concept annotations, relying instead on concepts extracted with CLIP in a zero-shot manner. We show that CLIP, which is central to these techniques, is prone to concept hallucination: it incorrectly predicts the presence or absence of concepts within an image in scenarios used by numerous CBMs, undermining the faithfulness of their explanations. To mitigate this issue, we introduce Concept Hallucination Inhibition via Localized Interpretability (CHILI), a technique that disentangles image embeddings and localizes the pixels corresponding to target concepts. Furthermore, our approach supports the generation of more interpretable saliency-based explanations.
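The zero-shot concept extraction the abstract refers to is commonly implemented in annotation-free CBMs as cosine similarity between a CLIP image embedding and the text embeddings of concept prompts. The sketch below illustrates that generic setup only; the concept list, prompt template, and file path are illustrative and are not taken from the paper, and the sketch does not implement CHILI itself.

```python
# Minimal sketch of zero-shot concept scoring with CLIP, as used by
# annotation-free CBMs: concept activations are cosine similarities
# between the image embedding and concept text embeddings.
# Concepts, prompts, and the image path are illustrative placeholders.
import torch
import clip  # https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

concepts = ["striped fur", "long beak", "metallic surface"]  # illustrative
prompts = [f"a photo containing {c}" for c in concepts]

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
tokens = clip.tokenize(prompts).to(device)

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(tokens)

# Normalize embeddings and take cosine similarity: one score per concept.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
concept_scores = (img_emb @ txt_emb.T).squeeze(0)  # shape: (num_concepts,)

for c, s in zip(concepts, concept_scores.tolist()):
    print(f"{c}: {s:.3f}")
```

Because these scores come from a single global image embedding, a high similarity does not guarantee that the concept is actually present or spatially grounded in the image, which is the hallucination failure mode the paper targets.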
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Chen_Sun1
Submission Number: 6082