Towards Formalizing Spuriousness of Biased Datasets Using Partial Information Decomposition

Published: 13 Nov 2025, Last Modified: 13 Nov 2025Accepted by TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: Spuriousness arises when there is an association between two or more variables in a dataset that are not causally related. In this work, we propose an explainability framework to preemptively disentangle the nature of such spurious associations in a dataset before model training. We leverage a body of work in information theory called Partial Information Decomposition (PID) to decompose the total information about the target into four non-negative quantities, namely unique information (in core and spurious features, respectively), redundant information, and synergistic information. Our framework helps anticipate when the core or spurious feature is indispensable, when either suffices, and when both are jointly needed for an optimal classifier trained on the dataset. Next, we leverage this decomposition to propose a novel measure of the spuriousness of a dataset. We arrive at this measure systematically by examining several candidate measures, and demonstrating what they capture and miss through intuitive canonical examples and counterexamples. Our framework Spurious Disentangler consists of segmentation, dimensionality reduction, and estimation modules, with capabilities to specifically handle high-dimensional image data efficiently. Finally, we also perform empirical evaluation to demonstrate the trends of unique, redundant, and synergistic information, as well as our proposed spuriousness measure across $6$ benchmark datasets under various experimental settings. We observe an agreement between our preemptive measure of dataset spuriousness and post-training model generalization metrics such as worst-group accuracy, further supporting our proposition. The code is available at https://github.com/Barproda/spuriousness-disentangler.
Submission Length: Regular submission (no more than 12 pages of main content)
Video: https://www.youtube.com/watch?v=xJKWBi_nRKg
Code: https://github.com/Barproda/spuriousness-disentangler
Assigned Action Editor: ~Amit_Sharma3
Submission Number: 4920
Loading