Keywords: representation drift, causal inference, variational inference, distribution shift, identifiability
Abstract: Machine learning systems operate in nonstationary settings where both the data-generating environment and the model itself evolve, a phenomenon traditionally studied as dataset shift and concept drift. Prior work often treats internal representations as black-box features or as latent variables to be inferred, precluding an estimable separation of drift caused by environmental change from drift caused by model updates. We introduce a probabilistic causal framework that embeds the feature extractor as a node in a larger structural causal model (SCM). This yields an interventional decomposition of representation drift into environment-driven and model-driven terms, estimable via variational inference without requiring paired pre-/post-update representations, together with consistency guarantees for the estimator. We show that the decomposition is well-defined under a relative identifiability condition and connect it to downstream performance through Integral Probability Metric (IPM) bounds. Empirical results validate relative identifiability and the robustness of the drift decomposition.
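For concreteness, one form such an interventional decomposition could take (a sketch under assumed notation only: the representation Z = f_θ(X), environment E, pre-/post-update extractor parameters θ and θ′, and divergence D are illustrative choices, not the paper's definitions):

% Sketch: Z = f_\theta(X) is the representation, E the environment,
% \theta the pre-update and \theta' the post-update extractor parameters,
% D an IPM (a pseudometric on distributions).
\Delta_{\mathrm{env}}   := D\big(P(Z \mid \mathrm{do}(E = e');\, \theta),\ P(Z \mid \mathrm{do}(E = e);\, \theta)\big)
\Delta_{\mathrm{model}} := D\big(P(Z \mid \mathrm{do}(E = e');\, \theta'),\ P(Z \mid \mathrm{do}(E = e');\, \theta)\big)
% Because an IPM satisfies the triangle inequality, total drift is bounded by the two terms:
\Delta_{\mathrm{total}} \le \Delta_{\mathrm{env}} + \Delta_{\mathrm{model}}

Here the environment term holds the model fixed while intervening on E, and the model term holds the (intervened) environment fixed while swapping parameters; the IPM bound mirrors the abstract's connection to downstream performance.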
Submission Number: 14