Keywords: Interpretable ML, User Study, Human Subject Evaluation, Invertible Neural Networks, Convolutional Networks
Abstract: A variety of methods exist to explain image classification models. However, whether they provide any benefit to users over simply comparing various inputs and the model’s respective predictions remains unclear. We conducted a user study (N=240) to test how such a baseline explanation technique performs against concept-based and counterfactual explanations. To this end, we contribute a synthetic dataset generator capable of biasing individual attributes and quantifying their relevance to the model. In a study, we assess if participants can identify the relevant set of attributes compared to the ground-truth. Our results show that the baseline outperformed concept-based explanations. Counterfactual explanations from an invertible neural network performed similarly as the baseline. Still, they allowed users to identify some attributes more accurately. Our results highlight the importance of measuring how well users can reason about biases of a model, rather than solely relying on technical evaluations or proxy tasks. We open-source our study and dataset so it can serve as a blue-print for future studies.
One-sentence Summary: Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset