Keywords: vision-language models, interpretability, explainability, multi-modal learning
TL;DR: We propose an interpretable model based on vision-language foundation models that generates a text explanation from the input and then makes the final task prediction based on the generated explanation.
Abstract: This paper proposes a novel class of interpretable models called explanation bottleneck models (XBMs), which are built on vision-language foundation models. XBMs generate a text explanation from the input and then make the final task prediction based on the generated explanation, leveraging pre-trained vision-language encoder-decoder models. To achieve both target task performance and explanation quality, we train XBMs with the target task loss together with a regularization term that penalizes the explanation decoder via distillation from the frozen pre-trained decoder. Our experiments confirm that XBMs provide accurate and fluent natural language explanations, and that the generated explanations can be intervened on with human feedback.
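A hedged sketch of the training objective described in the abstract (the exact loss terms, divergence direction, and symbols $p_\theta$, $p_\mathrm{pre}$, $f$, $\lambda$ are assumptions for illustration, not the authors' notation): with $e \sim p_\theta(\cdot \mid x)$ the explanation generated by the trainable decoder, $p_\mathrm{pre}$ the frozen pre-trained decoder, and $f$ the predictor reading only the explanation, one plausible form is
$\mathcal{L}(x, y) = \ell_\mathrm{task}\bigl(f(e), y\bigr) + \lambda\, D_\mathrm{KL}\bigl(p_\mathrm{pre}(\cdot \mid x) \,\|\, p_\theta(\cdot \mid x)\bigr)$,
where the first term drives task accuracy and the second keeps the explanation decoder close to the frozen pre-trained decoder so explanations remain fluent.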
Email Of Author Nominated As Reviewer: y.shinya.kml@gmail.com
Submission Number: 5