Time-uniform confidence bands for the CDF under nonstationarity

Paul Mineiro; Steven R Howard

Published: 21 Sept 2023, Last Modified: 02 Nov 2023

Venue: NeurIPS 2023 poster

Keywords: off-policy evaluation, anytime-valid

TL;DR: CDF estimation when DKW doesn't apply, e.g., controlled experiments in nonstationary settings

Abstract: Estimation of a complete univariate distribution from a sequence of observations is a useful primitive for both manual and automated decision-making. This problem has received extensive attention in the i.i.d. setting, but the arbitrary data-dependent setting remains largely unaddressed. We present computationally felicitous time-uniform and value-uniform bounds on the CDF of the running averaged conditional distribution of a sequence of real-valued random variables. Consistent with known impossibility results, our CDF bounds are always valid but sometimes trivial when the instance is too hard, and we give an instance-dependent convergence guarantee. The importance-weighted extension is appropriate for estimating complete counterfactual distributions of rewards given data from a randomized experiment, e.g., from an A/B test or a contextual bandit.
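
For orientation, here is a minimal Python sketch of the two baseline quantities the TL;DR and abstract refer to. It is not the paper's method. `weighted_ecdf` is a plain importance-weighted empirical CDF point estimate with no confidence band, and `dkw_halfwidth` is the classical DKW band half-width, which holds only for i.i.d. data at a fixed sample size. Both function names, the example rewards, and the example weights are hypothetical and chosen for illustration.

```python
import math
from typing import Sequence


def weighted_ecdf(values: Sequence[float], weights: Sequence[float], v: float) -> float:
    """Importance-weighted empirical CDF at v: (1/t) * sum_s w_s * 1{x_s <= v}.

    Illustrative point estimate only; it carries no coverage guarantee
    and, because weights can exceed 1, it is not clipped to [0, 1].
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if len(values) == 0:
        raise ValueError("need at least one observation")
    total = sum(w for x, w in zip(values, weights) if x <= v)
    return total / len(values)


def dkw_halfwidth(n: int, alpha: float) -> float:
    """Classical DKW half-width sqrt(ln(2/alpha) / (2n)).

    Valid for i.i.d. samples at a fixed n; it is neither time-uniform
    nor applicable to importance-weighted or nonstationary data.
    """
    if n <= 0 or not (0.0 < alpha < 1.0):
        raise ValueError("need n > 0 and 0 < alpha < 1")
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


# Hypothetical logged rewards and importance weights (target / logging propensity).
rewards = [0.2, 0.9, 0.5, 0.7]
weights = [2.0, 0.0, 1.0, 1.0]

on_policy = weighted_ecdf(rewards, [1.0] * len(rewards), 0.5)
counterfactual = weighted_ecdf(rewards, weights, 0.5)
```

With unit weights the estimate reduces to the ordinary empirical CDF. The weighted version reweights each observation toward the counterfactual policy, which is the kind of estimate the abstract's importance-weighted extension places bounds around.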

Supplementary Material: zip

Submission Number: 3754
