Keywords: reliance, overreliance, human-computer interaction, human-subject experiment, DNF, score function, trust, bias
TL;DR: Interpretability causes overreliance on an AI model's predictions, and the degree of overreliance varies with how the model is represented
Abstract: One of the underlying motivations for creating interpretable models is that they may help humans make better decisions. Given an interpretable model, a human decision-maker may be able to better understand the model's reasoning and incorporate its insights into their own decision-making process. Whether this effect occurs in practice is difficult to validate: it requires accounting for individuals' prior beliefs and objectively measuring when reliance on the model goes beyond what is reasonable given the available information. In this work, we address these challenges and test whether interpretability improves decision-making. Concretely, we compare how humans make decisions given a black-box model versus an interpretable model, while controlling for their prior beliefs and rigorously quantifying rational behavior. Our results show that interpretable models can lead to overreliance and that the level of overreliance varies across models that we would consider equally interpretable. These findings raise fundamental concerns about current approaches to AI-assisted decision-making: they suggest that making models transparent is insufficient, and currently counterproductive, for promoting appropriate reliance.
Track: Main track
Submitted Paper: No
Published Paper: No
Submission Number: 58