Detecting Shortcuts using Mutual Information

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Supplementary Material: pdf
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: shortcuts, spurious correlation, mutual information, information theory, neural tangent kernel
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a mutual-information-based method for detecting shortcuts (spurious correlations) in training data.
Abstract: The failure of deep neural networks to generalize to out-of-distribution (OOD) data is a well-known problem that raises concerns about the deployment of trained networks in safety-critical domains such as healthcare and autonomous vehicles. We study a particular kind of distribution shift: shortcuts, or spurious correlations, in the training data. Because these correlations are absent from real-world test data, models that rely on them suffer a performance drop under distribution shift, a failure mode referred to as shortcut learning. Shortcut learning is often exposed only when models are evaluated in carefully controlled experimental settings, which makes it difficult for AI practitioners to properly assess the effectiveness of a trained model for real-world applications. In this work, we try to understand shortcut learning using information-theoretic tools and propose to use the mutual information (MI) between the learned representation and the input space as a domain-agnostic metric for detecting shortcuts in training datasets. To study the training dynamics of shortcut learning, we develop a Neural Tangent Kernel (NTK) based framework, which can be used to detect shortcuts and spurious correlations in the training data without requiring class labels of the test data. We empirically demonstrate on multiple datasets, such as MNIST, CelebA, NICO, Waterbirds, and BenchMD, that MI can effectively detect shortcuts. We benchmark against multiple OOD detection baselines to show that OOD detectors cannot detect shortcuts, and that our method can be used as a complement to OOD detectors to identify all types of distribution shift in the datasets, including shortcuts.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8022