Explanation Shift: How Did the Distribution Shift Impact the Model?

TMLR Paper3065 Authors

24 Jul 2024 (modified: 17 Sept 2024) · Under review for TMLR · CC BY 4.0
Abstract: The performance of machine learning models on new data is critical for their success in real-world applications. Current methods for detecting shifts in the input or output data distributions have limited ability to identify changes in model behavior when no labelled data is available. In this paper, we define \emph{explanation shift} as the statistical comparison between how predictions from training data are explained and how predictions on new data are explained. We propose explanation shift as a key indicator for investigating the interaction between distribution shifts and learned models. We introduce an Explanation Shift Detector that operates on the explanation distributions, providing more sensitive and explainable indicators of changes in the interaction between distribution shifts and learned models. We compare explanation shift with other methods based on distribution shifts, showing that monitoring for explanation shift yields more sensitive indicators of varying model behavior. We provide theoretical and experimental evidence and demonstrate the effectiveness of our approach on synthetic and real data. Additionally, we release an open-source Python package, \texttt{skshift}, which implements our method and provides usage tutorials for further reproducibility.
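As an illustration of the idea described in the abstract, the following is a minimal sketch of an explanation-shift check: fit a model on source data, compute SHAP explanations on source and new data, and use a domain classifier over the explanation space as the shift indicator. The synthetic data, model choice, and use of \texttt{shap}/\texttt{scikit-learn} here are illustrative assumptions, not the \texttt{skshift} API.

```python
# Sketch of an explanation-shift detector (illustrative, not the skshift API):
# 1) fit a model on source data,
# 2) compute SHAP explanations on source and new data,
# 3) train a classifier to distinguish the two explanation distributions;
#    an AUC well above 0.5 signals explanation shift.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic source (training) data and new data with a covariate shift.
X_src = rng.normal(size=(1000, 3))
y_src = X_src[:, 0] + X_src[:, 1] ** 2 + rng.normal(scale=0.1, size=1000)
X_new = rng.normal(size=(1000, 3))
X_new[:, 1] += 1.5  # shift one input feature

model = GradientBoostingRegressor().fit(X_src, y_src)

# Explanation distributions: SHAP values of the model's predictions.
explainer = shap.TreeExplainer(model)
S_src = explainer.shap_values(X_src)
S_new = explainer.shap_values(X_new)

# Domain classifier on the explanation space (classifier two-sample test).
S = np.vstack([S_src, S_new])
d = np.concatenate([np.zeros(len(S_src)), np.ones(len(S_new))])
auc = cross_val_score(LogisticRegression(max_iter=1000), S, d,
                      scoring="roc_auc", cv=5).mean()
print(f"Explanation-shift AUC: {auc:.2f}")  # ~0.5 means no detectable shift
```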
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: no
Assigned Action Editor: ~Ikko_Yamane1
Submission Number: 3065