Concept-based Explanations for Out-of-Distribution Detectors

Jihye Choi; Jayaram Raghuram; Ryan Feng; Jiefeng Chen; Somesh Jha; Atul Prakash

Concept-based Explanations for Out-of-Distribution Detectors

Jihye Choi, Jayaram Raghuram, Ryan Feng, Jiefeng Chen, Somesh Jha, Atul Prakash

Published: 01 Feb 2023, Last Modified: 22 Jun 2025Submitted to ICLR 2023Readers: Everyone

Keywords: out-of-distribution detection, interpretability, concept-based explanations

TL;DR: We propose the first work to provide concept-based explanations for out-of-distribution detectors.

Abstract: Out-of-distribution (OOD) detection plays a crucial role in ensuring the safe deployment of deep neural network (DNN) classifiers. While a myriad of methods have focused on improving the performance of OOD detectors, a critical gap remains in interpreting their decisions. We help bridge this gap by providing explanations for OOD detectors based on learned high-level concepts. We first propose two new metrics for assessing the effectiveness of a particular set of concepts for explaining OOD detectors: 1) $\textit{detection completeness}$, which quantifies the sufficiency of concepts for explaining an OOD-detector's decisions, and 2) $\textit{concept separability}$, which captures the distributional separation between in-distribution and OOD data in the concept space. Based on these metrics, we propose a framework for learning a set of concepts that satisfy the desired properties of detection completeness and concept separability, and demonstrate the framework's effectiveness in providing concept-based explanations for diverse OOD detection techniques. We also show how to identify prominent concepts that contribute to the detection results via a modified Shapley value-based importance score.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/concept-based-explanations-for-out-of/code)

7 Replies

Loading