Causal Analysis of Representation Drift for Robust Deployment
Keywords: representation drift, causal attribution, distribution shift robustness, inference-time monitoring, trustworthy multimodal AI
Abstract: Machine learning systems operate in nonstationary settings where both the data-generating environment and the model itself evolve, a phenomenon traditionally studied as dataset shift and concept drift. Prior work often treats internal representations as black-box features or latent variables to be inferred, which precludes an estimable separation of drift caused by environmental change from drift caused by model updates. We introduce a probabilistic causal framework that embeds the feature extractor as a node in a larger structural causal model (SCM). This yields an interventional decomposition of representation drift into environment- and model-driven terms, estimable via variational inference without requiring paired pre/post-update representations, with consistency guarantees on the estimator. We show that the decomposition is well-defined under relative identifiability, and connect it to downstream performance through exact causal risk analysis and Integral Probability Metric (IPM) bounds. Furthermore, we address the path dependence of sequential updates by proposing an order-free, Shapley-style attribution method. Empirical results validate the relative identifiability assumption and the robustness of the drift decomposition.
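The order-free, Shapley-style attribution mentioned above can be illustrated with a minimal sketch. Here, `drift` is a hypothetical value function returning the representation drift observed when a subset of change factors (an environment shift and a model update; names and numbers are illustrative, not from the paper) is switched on; the Shapley formula then attributes total drift to each factor independently of update order.

```python
from itertools import combinations
from math import factorial

def shapley_attribution(drift, factors):
    """Order-free attribution of total drift to each factor.

    drift(S) -> float: drift observed when exactly the change
    factors in frozenset S are applied (hypothetical value function).
    """
    n = len(factors)
    phi = {}
    for f in factors:
        others = [g for g in factors if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                S = frozenset(subset)
                # Shapley weight: |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # Marginal contribution of f on top of coalition S
                total += w * (drift(S | {f}) - drift(S))
        phi[f] = total
    return phi

# Toy value function with an interaction term: the model update
# amplifies environmental drift (all numbers are made up).
def drift(S):
    d = 0.0
    if "env" in S:
        d += 1.0
    if "model" in S:
        d += 0.5
    if {"env", "model"} <= S:
        d += 0.2
    return d

phi = shapley_attribution(drift, ["env", "model"])
# Attributions sum to the total drift of the full coalition (efficiency).
```

By construction the attributions are symmetric in update order and sum exactly to the drift observed when all changes are applied together, which is the property the abstract invokes to sidestep path dependence.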
Submission Number: 140