Keywords: Concept-based models, Interpretability, Concept bottleneck models
TL;DR: This paper analyzes concept bottleneck models (CBMs) in terms of test-time intervention and evaluates different intervention methods.
Abstract: Concept bottleneck models (CBMs) are a class of interpretable neural network models that predict the target label of a given input based on its high-level concepts. Unlike other end-to-end deep learning models, CBMs enable domain experts to intervene on the predicted concepts at test time so that more accurate and reliable target predictions can be made. While this intervenability provides a powerful avenue of control, many aspects of the intervention procedure remain underexplored. In this work, we inspect the current intervention practice for its efficiency and reliability. Specifically, we first present an array of new intervention methods that significantly improve target prediction accuracy under a given intervention budget. We also bring attention to non-trivial yet previously overlooked issues concerning the reliability and fairness of interventions and discuss how to address these problems in practice.
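To make the test-time intervention mechanism described above concrete, the following is a minimal sketch of a CBM whose predicted concepts can be overwritten by expert-provided values before the label is computed. The architecture, module names, and intervention interface here are illustrative assumptions, not the paper's implementation.

```python
# Minimal concept bottleneck model (CBM) sketch with test-time concept intervention.
# Two stages: input -> concept predictions (the "bottleneck") -> target label.
# All dimensions and layer choices are hypothetical.
import torch
import torch.nn as nn


class ConceptBottleneckModel(nn.Module):
    def __init__(self, input_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        # g: maps raw inputs to concept logits
        self.concept_predictor = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(), nn.Linear(128, num_concepts)
        )
        # f: maps concept probabilities to the target label
        self.label_predictor = nn.Linear(num_concepts, num_classes)

    def forward(self, x, intervened_concepts=None, intervention_mask=None):
        # Predicted concept probabilities.
        concepts = torch.sigmoid(self.concept_predictor(x))
        # Test-time intervention: wherever the mask is 1, replace the model's
        # predicted concept value with the expert-provided one.
        if intervened_concepts is not None and intervention_mask is not None:
            concepts = torch.where(intervention_mask.bool(),
                                   intervened_concepts, concepts)
        return self.label_predictor(concepts), concepts


# Usage: intervene on a subset of concepts for one test example.
model = ConceptBottleneckModel(input_dim=32, num_concepts=10, num_classes=5)
x = torch.randn(1, 32)
expert_values = torch.zeros(1, 10)            # expert-provided concept values
mask = torch.zeros(1, 10)
mask[0, :3] = 1                               # intervene on the first three concepts only
logits, concepts = model(x, intervened_concepts=expert_values, intervention_mask=mask)
```

Under a fixed budget (number of concepts that can be corrected), the choice of which concepts to intervene on, i.e., how the mask above is selected, is exactly what the intervention methods studied in the paper vary.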
Community Implementations: 1 code implementation via CatalyzeX (https://www.catalyzex.com/paper/a-closer-look-at-the-intervention-procedure/code)