Keywords: vision-language models, interpretability, explainability, multi-modal learning
TL;DR: We propose an interpretable model based on vision-language foundation models that generates a text explanation from the input and then makes the final task prediction based on the generated explanation.
Abstract: This paper proposes a novel class of interpretable models called explanation bottleneck models (XBMs), which are built on vision-language foundation models. XBMs generate a text explanation from the input and then make the final task prediction based on the generated explanation, leveraging pre-trained vision-language encoder-decoder models. To achieve both target task performance and explanation quality, we train XBMs with the target task loss together with a regularization term that penalizes the explanation decoder via distillation from the frozen pre-trained decoder. Our experiments confirm that XBMs provide accurate and fluent natural language explanations, and that the generated explanations can be intervened on with human feedback.
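A hedged sketch of the training objective described in the abstract (the exact loss terms, divergence direction, and symbols $p_\theta$, $p_\mathrm{pre}$, $f$, $\lambda$ are assumptions for illustration, not the authors' notation): with $e \sim p_\theta(\cdot \mid x)$ the explanation generated by the trainable decoder, $p_\mathrm{pre}$ the frozen pre-trained decoder, and $f$ the predictor reading only the explanation, one plausible form is
$\mathcal{L}(x, y) = \ell_\mathrm{task}\bigl(f(e), y\bigr) + \lambda\, D_\mathrm{KL}\bigl(p_\mathrm{pre}(\cdot \mid x) \,\|\, p_\theta(\cdot \mid x)\bigr)$,
where the first term drives task accuracy and the second keeps the explanation decoder close to the frozen pre-trained decoder so explanations remain fluent.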
Email Of Author Nominated As Reviewer: y.shinya.kml@gmail.com
Submission Number: 5