Abstract: Big-data-driven learning models are created by training connectionist models. With the increase in computing power and memory size, these models are becoming practical solutions for predicting image classifications, driving trajectories, and users' behaviors. Although these models can be shown to perform with high accuracy, accuracy alone is not enough to understand why the network predicts certain outputs for certain inputs. These networks behave as black boxes, capable of processing very large amounts of data without being transparent about their inner workings. This paper extends the architecture of a convolutional neural network and trains only the new connections so that the network outputs an explanation for every prediction of the original classifier. The explanations are drawn from a semantic language that is either computed or annotated from available data. Our work includes (1) defining and computing a language relevant to the classifier domain and semantically understandable by humans, (2) computing the explanatory layer of the original network, (3) training the extended architecture without changing the original given weights, and (4) formatting the explanations in a user-understandable manner. We applied our algorithmic solution to two existing classifiers in the automated driving domain and showed successful results explaining predictive classifications of driving comfort and driving trajectories.
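The paper itself provides no code; the following is only a minimal PyTorch sketch of the general idea of steps (2) and (3): a frozen, pretrained classifier is extended with a new explanatory layer, and only the new connections are trained. The ResNet-18 backbone, the class name ExplainedClassifier, and parameters such as num_explanation_terms are illustrative assumptions, not details from the paper.

    import torch
    import torch.nn as nn
    from torchvision import models

    class ExplainedClassifier(nn.Module):
        """A frozen pretrained classifier plus a new, trainable explanation head."""

        def __init__(self, num_explanation_terms: int):
            super().__init__()
            base = models.resnet18(weights="IMAGENET1K_V1")   # stand-in for the given classifier
            for p in base.parameters():
                p.requires_grad = False                       # original weights stay unchanged
            feat_dim = base.fc.in_features
            self.classifier_head = base.fc                    # original prediction head (frozen)
            base.fc = nn.Identity()                           # expose penultimate features
            self.backbone = base
            # New explanatory layer: maps internal features to semantic-language terms.
            self.explainer = nn.Linear(feat_dim, num_explanation_terms)

        def forward(self, x):
            feats = self.backbone(x)                          # frozen feature extractor
            prediction = self.classifier_head(feats)          # unchanged classifier output
            explanation_logits = self.explainer(feats)        # new explanatory output
            return prediction, explanation_logits

    model = ExplainedClassifier(num_explanation_terms=32)
    # Only the new connections receive gradient updates.
    optimizer = torch.optim.Adam(model.explainer.parameters(), lr=1e-3)
    criterion = nn.BCEWithLogitsLoss()                        # multi-label semantic terms

    x = torch.randn(4, 3, 224, 224)                           # dummy image batch
    term_targets = torch.randint(0, 2, (4, 32)).float()       # dummy explanation annotations
    _, logits = model(x)
    loss = criterion(logits, term_targets)
    loss.backward()
    optimizer.step()

In this sketch the original classifier's predictions are untouched, while the explanation head learns to map the classifier's internal features onto terms of the semantic language.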
External IDs: dblp:conf/ivs/GoldmanB21