Keywords: Shapley values, Time Series, Drift detection
TL;DR: Using Shapley values for distributional shift detection and visualization
Abstract: In streaming data, distributional shifts can appear both in the univariate dimensions
and in the joint distributions with the labels. However, in many real-time scenarios,
labels are often missing or delayed, so unsupervised drift detection methods
are needed in those applications.
We design slidSHAPs, a novel representation method for unlabelled data streams.
Shapley values, widely used to explain machine learning models, offer a way to
exploit correlation dependencies among random variables. We develop an
unsupervised sliding series of Shapley values for categorical time series, which
represents the data stream in a newly defined latent space and tracks changes in
the feature correlations.
Transforming the original time series into slidSHAPs allows us to track how
distributional shifts affect the correlations among the input variables; the approach
is independent of any kind of labeling. We show how abrupt distributional shifts
in the input variables are transformed into smoother changes in the slidSHAPs.
Moreover, slidSHAPs allow for intuitive visualization of the shifts when they are
not observable in the original data.
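
The idea of a sliding Shapley value series can be illustrated with a minimal sketch. The abstract does not specify the value function, so this example assumes total correlation (sum of marginal entropies minus joint entropy) computed on each window. All function names and the `window_size` and `step` parameters are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import factorial, log2


def entropy(symbols):
    """Empirical Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def total_correlation(window, subset):
    """Assumed value function: sum of marginal entropies minus joint entropy."""
    if len(subset) < 2:
        return 0.0  # a single feature (or none) carries no correlation
    marginals = sum(entropy([row[j] for row in window]) for j in subset)
    joint = entropy([tuple(row[j] for j in subset) for row in window])
    return marginals - joint


def shapley_values(window, n_features):
    """Exact Shapley value of every feature under the assumed value function."""
    values = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        phi = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
            for subset in combinations(others, k):
                gain = (total_correlation(window, subset + (i,))
                        - total_correlation(window, subset))
                phi += weight * gain
        values.append(phi)
    return values


def sliding_shapley(stream, window_size, step):
    """Return one Shapley vector per window position over an unlabelled stream."""
    n_features = len(stream[0])
    return [
        shapley_values(stream[start:start + window_size], n_features)
        for start in range(0, len(stream) - window_size + 1, step)
    ]


if __name__ == "__main__":
    # Toy categorical stream: features 0 and 1 are identical in the first
    # segment and become independent in the second segment.
    before = [(t % 2, t % 2, (t // 2) % 2) for t in range(8)]
    after = [(t % 2, (t // 2) % 2, (t // 4) % 2) for t in range(8)]
    for vector in sliding_shapley(before + after, window_size=8, step=8):
        print(vector)
```

Each window of the stream is mapped to one vector with a component per feature, so a change in how the features depend on each other appears as a change in that vector series, without any labels. The exact enumeration over feature subsets is exponential in the number of features, so this sketch is only practical for a small number of features.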