Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations

NeurIPS 2024 Workshop ATTRIB, Submission 41 Authors

Published: 30 Oct 2024, Last Modified: 21 Jan 2025
Venue: ATTRIB 2024
License: CC BY 4.0
Release Opt Out: No, I don't wish to opt out of paper release. My paper should be released.
Keywords: Model decomposition, Functional ANOVA, Accumulated Local Effects, Kitagawa-Oaxaca-Blinder, feature importance, misattribution
Abstract: In science and social science, we often wish to explain why an outcome differs between two populations. For instance, if a jobs program benefits members of one city more than another, is that due to differences in the program participants (their covariates) or in the local labor markets (outcomes given covariates)? The Kitagawa-Oaxaca-Blinder (KOB) decomposition is a standard tool in econometrics that explains the difference in the mean outcome across two populations. However, the KOB decomposition assumes a linear relationship between covariates and outcomes, while the true relationship may be meaningfully nonlinear. Modern machine learning offers a variety of nonlinear functional decompositions for the relationship between outcomes and covariates in one population, and it seems natural to extend the KOB decomposition using them. We observe that a successful extension should not attribute the difference to covariates if the covariate distributions are the same in the two populations, nor to outcomes given covariates if that conditional relationship is the same in the two populations. Unfortunately, we demonstrate that, even in simple examples, two common decompositions, the functional ANOVA and Accumulated Local Effects, can attribute differences to outcomes given covariates even when the outcomes given covariates are identical in the two populations. We state, and partially prove, a conjecture that this misattribution arises in any additive decomposition that depends on the distribution of covariates.
Submission Number: 41
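
To make the linear baseline in the abstract concrete, the following is a minimal sketch of a twofold KOB decomposition, not the authors' code. It assumes ordinary least squares fits in each population and uses population B's coefficients as the reference, which is one common convention among several. The names `fit_ols` and `kob_twofold` and the synthetic data are hypothetical. The example is built so that outcomes given covariates are identical and linear in both populations, so the linear decomposition assigns the whole gap to covariates.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ [1, X], intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def kob_twofold(X_a, y_a, X_b, y_b):
    """Split mean(y_a) - mean(y_b) into a covariate term and an
    outcomes-given-covariates term (reference coefficients: population B,
    an assumed convention)."""
    beta_a = fit_ols(X_a, y_a)
    beta_b = fit_ols(X_b, y_b)
    # Mean covariate vectors, with a leading 1 for the intercept.
    mean_a = np.concatenate([[1.0], X_a.mean(axis=0)])
    mean_b = np.concatenate([[1.0], X_b.mean(axis=0)])
    covariate_term = (mean_a - mean_b) @ beta_b
    outcome_term = mean_a @ (beta_a - beta_b)
    gap = y_a.mean() - y_b.mean()
    return gap, covariate_term, outcome_term

# Synthetic data (hypothetical): covariate means differ between the two
# populations, but the same noiseless linear rule maps covariates to outcomes.
rng = np.random.default_rng(0)
X_a = rng.normal(loc=1.0, size=(500, 2))
X_b = rng.normal(loc=0.0, size=(500, 2))
coef = np.array([2.0, -1.0])
y_a = 0.5 + X_a @ coef
y_b = 0.5 + X_b @ coef

gap, covariate_term, outcome_term = kob_twofold(X_a, y_a, X_b, y_b)
print(f"gap={gap:.4f} covariates={covariate_term:.4f} outcomes={outcome_term:.4f}")
```

Because OLS with an intercept reproduces each population's mean outcome exactly, the two terms sum to the gap up to floating-point error. The abstract's concern is that this clean behavior does not automatically carry over when the linear fit is replaced by a nonlinear functional decomposition.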