Keywords: Interpretability, Spurious Correlations, Robustness
TL;DR: We propose a new task and benchmark for doubly right object recognition, which requires a model not only to predict the right thing, but also to do so for the right reason.
Abstract: Existing deep neural networks are optimized to predict the right thing, yet they may rely on the wrong evidence. Relying on the wrong evidence undermines out-of-distribution generalization and underscores the gap between machine perception and human perception. In this paper, we introduce an overlooked but important problem, "doubly right object recognition," which requires the model not only to predict the right outcome, but also to use the right reasons, that is, reasons aligned with human perception. Existing benchmarks cannot be used to train or evaluate models on this task, because in them both the right reasons and spurious correlations are predictive of the final outcome. Without additional supervision and annotation specifying the right reasons for recognition, doubly right object recognition is impossible. To address this, we collect a dataset that contains annotated right reasons aligned with human perception, and we train a fully interpretable model that uses only the attributes from our collected dataset for object prediction. Through empirical experiments, we demonstrate that our method trains models that are more likely to predict the right thing for the right reason, that generalize better on ObjectNet, and that exhibit zero-shot learning ability.