Keywords: covariate shift, selective labels, evaluation, double machine learning, distribution shift
TL;DR: We develop a double machine learning estimator for pre-deployment evaluation of prediction models under the joint presence of covariate shift and selective labels.
Abstract: Understanding how a model will perform when deployed in unseen environments is essential to preventing harm when algorithms inform decision-making. Two important drivers of model performance degradation are (i) \emph{covariate shift}, where the target covariate distribution differs from the source, and (ii) \emph{selective labels}, where the observability of outcomes is influenced by the model itself. We study \emph{pre-deployment} model evaluation under the joint presence of covariate shift and selective labels. In particular, we present a double machine learning estimation procedure for the target risk of an arbitrary black-box prediction model under a given loss function. We show identification of this estimand under standard assumptions, and derive a bias-corrected estimator based on the influence function of the target risk. We demonstrate our proposed estimator through controlled synthetic data and semi-synthetic eICU data experiments, which show that our estimator tracks the true target risk more accurately than combinations of standard plug-in approaches.
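The estimator itself is not spelled out on this page, but under the standard assumptions the abstract alludes to (labels missing at random given covariates, and overlap), the target risk is identified as $R = \mathbb{E}_T[g(X)] + \mathbb{E}_S[w(X)\,\tfrac{D}{\pi(X)}(\ell - g(X))]$, where $g(x) = \mathbb{E}[\ell \mid X = x, D = 1]$ is the outcome nuisance, $\pi(x) = P(D = 1 \mid X = x)$ the labeling propensity, and $w(x) = p_T(x)/p_S(x)$ the density ratio. The sketch below is a generic cross-fitted, doubly robust implementation of that form with scikit-learn nuisance models; the function name, the choice of gradient boosting, and the classifier-based density-ratio estimate are illustrative assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def estimate_target_risk(X_src, y_src, d_src, X_tgt, model, loss_fn,
                         n_folds=5, seed=0, clip=1e-3):
    """Cross-fitted, bias-corrected estimate of the target-domain risk
    E_target[loss(model(X), Y)] from source data with selectively observed
    labels (d_src == 1) and an unlabeled target sample X_tgt.
    `model` is assumed to expose a scikit-learn-style .predict()."""
    n_src, n_tgt = len(X_src), len(X_tgt)
    labeled = d_src == 1
    losses = np.full(n_src, np.nan)
    losses[labeled] = loss_fn(model.predict(X_src[labeled]), y_src[labeled])

    # Density ratio w(x) = p_tgt(x) / p_src(x) via probabilistic classification
    # of source vs. target samples (a common plug-in choice; fit once here
    # rather than cross-fit, to keep the sketch short).
    X_both = np.vstack([X_src, X_tgt])
    s = np.concatenate([np.zeros(n_src), np.ones(n_tgt)])
    domain_clf = GradientBoostingClassifier(random_state=seed).fit(X_both, s)
    p_tgt = np.clip(domain_clf.predict_proba(X_src)[:, 1], clip, 1 - clip)
    w = (n_src / n_tgt) * p_tgt / (1 - p_tgt)

    psi_src = np.zeros(n_src)   # per-example correction terms on the source
    g_tgt = np.zeros(n_tgt)     # fold-averaged outcome nuisance on the target
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for train, test in kf.split(X_src):
        # Labeling propensity pi(x) = P(D = 1 | X = x), fit off-fold.
        pi_model = GradientBoostingClassifier(random_state=seed)
        pi_model.fit(X_src[train], d_src[train])
        pi = np.clip(pi_model.predict_proba(X_src[test])[:, 1], clip, 1.0)

        # Outcome nuisance g(x) = E[loss | X = x, D = 1], fit on labeled rows.
        lab = train[labeled[train]]
        g_model = GradientBoostingRegressor(random_state=seed)
        g_model.fit(X_src[lab], losses[lab])
        g = g_model.predict(X_src[test])

        # Doubly robust correction: w(X) * D / pi(X) * (loss - g(X)).
        resid = np.where(labeled[test], np.nan_to_num(losses[test]) - g, 0.0)
        psi_src[test] = w[test] * d_src[test] / pi * resid
        g_tgt += g_model.predict(X_tgt) / n_folds

    # Plug-in target term plus source-side bias correction.
    return g_tgt.mean() + psi_src.mean()
```

Cross-fitting the propensity and outcome nuisances keeps the plug-in bias second order, which is the point of the double machine learning construction, and clipping the estimated probabilities guards against extreme inverse weights under weak overlap.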
Submission Number: 154