Concept Gradient: Concept-based Interpretation Without Linear Assumption

Andrew Bai; Chih-Kuan Yeh; Neil Y.C. Lin; Pradeep Kumar Ravikumar; Cho-Jui Hsieh

Published: 01 Feb 2023, Last Modified: 02 Mar 2023. ICLR 2023 poster. Readers: Everyone

Keywords: Interpretability, Concept-based interpretation, XAI

TL;DR: Extending concept-based gradient interpretation to non-linear concept functions.

Abstract: Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based gradient interpretation is the Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and concepts. The premise that meaningful concepts lie in a linear subspace of model layers is usually implicitly assumed but does not hold in general. In this work, we propose Concept Gradient (CG), which extends concept-based gradient interpretation methods to non-linear concept functions. We show that for a general (potentially non-linear) concept, we can mathematically measure how a small change in the concept affects the model’s prediction, which extends gradient-based interpretation to the concept space. We demonstrate empirically that CG outperforms CAV in attributing concept importance on real-world datasets and perform a case study on a medical dataset. The code is available at github.com/jybai/concept-gradients.
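
The idea of measuring how a small change in a concept affects the prediction can be illustrated with a toy example. The sketch below assumes a single scalar concept c = g(x) and a model output y = f(x) that are both differentiable in the same input x, and uses the chain rule to estimate dy/dc as (∇g · ∇f) / ||∇g||², which is the pseudo-inverse of ∇g applied to ∇f. The functions `concept` and `model`, the finite-difference helper, and this single-concept form are assumptions made for illustration; they are not taken from the authors' repository.

```python
import numpy as np


def numerical_gradient(fn, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function fn at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (fn(x + step) - fn(x - step)) / (2 * eps)
    return grad


def concept(x):
    # Hypothetical non-linear concept function g(x); an assumption for this demo.
    return x[0] ** 2 + np.sin(x[1])


def model(x):
    # Hypothetical model f(x) that depends on x only through the concept,
    # so the true sensitivity df/dc is known analytically: 6 * g(x).
    return 3.0 * concept(x) ** 2


def concept_gradient(model_fn, concept_fn, x):
    """Estimate df/dc for one scalar concept from input-space gradients.

    For a single concept, the pseudo-inverse of the gradient vector grad_g
    is grad_g / ||grad_g||^2, so the estimate is a normalized dot product.
    """
    grad_f = numerical_gradient(model_fn, x)
    grad_g = numerical_gradient(concept_fn, x)
    return float(grad_g @ grad_f / (grad_g @ grad_g))


x = np.array([0.7, -0.3])
cg = concept_gradient(model, concept, x)
expected = 6.0 * concept(x)  # analytic df/dc for the toy model above
print(cg, expected)
```

Because the toy model depends on x only through the concept, the estimate should agree with the analytic derivative 6·g(x) up to finite-difference error, even though g is non-linear in x.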

Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

Supplementary Material: zip
