Keywords: distribution shift, generalized additive model, knockoffs, explainable machine learning
Abstract: Regardless of the amount of data a machine learning (ML) model is trained on, there will
inevitably be data that differ from its training set, lowering model performance. Concept
shift occurs when the distribution of labels conditioned on the features changes, so that
even a well-tuned ML model has learned a fundamentally incorrect representation.
Identifying these shifted features provides unique insight into how one dataset differs from
another, as the difference may lie along a scientifically relevant dimension such
as time, disease status, or population. In this paper, we propose SGShift, a model for
detecting concept shift in tabular data and attributing reduced model performance to a
sparse set of shifted features. We frame concept shift as a feature selection task to learn
the features that can explain performance differences between models in the source and
target domains. This framework enables SGShift to adapt powerful statistical tools such as
generalized additive models, knockoffs, and absorption towards identifying these shifted
features. We conduct extensive experiments on synthetic and real data across various ML
models and find that SGShift identifies shifted features with AUC > 0.9, far higher than
baseline methods, while requiring few samples from the shifted domain and remaining robust in
complex cases of concept shift. Applying SGShift to two real-world cases in healthcare and
genetics yields new feature-level explanations of concept shift, including the reduced impact
of respiratory failure on COVID-19 severity after Omicron and the impact of European-specific
rare variants on lupus prevalence.
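Illustrative sketch (not part of the submission): the abstract frames concept shift as a feature selection problem, attributing a source model's performance drop in the target domain to a sparse set of features. The snippet below is a minimal, hypothetical analogue of that framing using off-the-shelf scikit-learn tools; it substitutes a simple L1-penalized regression of target-domain residuals on the features as a stand-in for SGShift's generalized additive models, knockoffs, and absorption, and all variable names and the synthetic data are assumptions for illustration only.

```python
# Hypothetical sketch of "concept shift as sparse feature selection".
# This is NOT the SGShift implementation, only the general idea.
import numpy as np
from sklearn.linear_model import LogisticRegression, Lasso

rng = np.random.default_rng(0)
n, p = 2000, 10

# Source domain: labels depend on all features through fixed coefficients.
beta = rng.normal(size=p)
X_src = rng.normal(size=(n, p))
y_src = rng.binomial(1, 1 / (1 + np.exp(-X_src @ beta)))

# Target domain: the coefficient of feature 3 changes (concept shift).
beta_tgt = beta.copy()
beta_tgt[3] += 2.0
X_tgt = rng.normal(size=(n, p))
y_tgt = rng.binomial(1, 1 / (1 + np.exp(-X_tgt @ beta_tgt)))

# Fit the source model, then regress its target-domain residuals on the
# target features with an L1 penalty; nonzero coefficients flag candidate
# shifted features (a crude analogue of sparse shift attribution).
src_model = LogisticRegression().fit(X_src, y_src)
resid = y_tgt - src_model.predict_proba(X_tgt)[:, 1]
selector = Lasso(alpha=0.05).fit(X_tgt, resid)
print("candidate shifted features:", np.flatnonzero(selector.coef_))
```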
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 19832