Adaptive A/B Testing under Nonstationary Dynamics using State-Space Models

Published: 03 Feb 2026, Last Modified: 03 Feb 2026, Venue: AISTATS 2026 Poster, Readers: Everyone, License: CC BY 4.0
TL;DR: We propose a Kalman filter–based response-adaptive randomization framework that adapts A/B testing to nonstationary environments, improving efficiency under shifting treatment effects and variances.
Abstract: A/B testing is central to evaluating how modifications to products, services, and user experiences affect user outcomes. Yet in practice, experiments rarely occur in stationary environments: seasonality, feature launches, and evolving user demographics make the underlying treatment effects shift over time. Conventional fixed-allocation designs do not adapt to this nonstationarity; they rely on static treatment allocations that can compromise estimation efficiency and waste experimental resources. Response-adaptive randomization (RAR) designs provide a natural alternative, allocating participants adaptively over time based on accrued information. However, deploying RAR designs in nonstationary environments raises fundamental challenges: the underlying treatment effects drift over time, noise levels may vary, and continuous monitoring is required to maintain valid statistical inference. In this work, we propose a methodological framework that addresses these challenges. First, we model period-level treatment arm means as autoregressive state-space processes and develop a Kalman filter estimator that exploits temporal dependence. Second, we propose an RAR design that accommodates nonstationarity by incorporating state uncertainty via predicted Kalman variances. Our theoretical analysis establishes asymptotic normality of the treatment effect estimator, compares relative efficiency, and enables the construction of anytime-valid confidence sequences for continuous monitoring. Simulation studies demonstrate that our method is significantly more efficient than a benchmark time-averaging estimator and a fixed allocation strategy, particularly under treatment effect drift and variance imbalance.
Submission Number: 2106
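
The two ingredients named in the abstract, a Kalman filter over autoregressive arm means and an allocation driven by predicted Kalman variances, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar AR(1) state model with known `phi`, `q`, and `sigma2`, the function names `kalman_step`, `allocate`, and `simulate`, and in particular the square-root allocation heuristic with a clipping floor, which stands in for whatever allocation rule the authors actually derive.

```python
import numpy as np

def kalman_step(m, P, y, n, phi, q, sigma2):
    """One predict/update step for a scalar AR(1) arm mean observed via a sample mean."""
    # Predict: mu_t = phi * mu_{t-1} + w_t, with w_t ~ N(0, q)
    m_pred = phi * m
    P_pred = phi**2 * P + q
    # Update: ybar_t = mu_t + v_t, with v_t ~ N(0, sigma2 / n)
    R = sigma2 / n
    K = P_pred / (P_pred + R)          # Kalman gain
    m_new = m_pred + K * (y - m_pred)
    P_new = (1.0 - K) * P_pred
    return m_new, P_new, P_pred

def allocate(P_pred, sigma2, floor=0.1):
    """Hypothetical heuristic: treatment share proportional to sqrt(P_pred + sigma2)."""
    s = np.sqrt(np.asarray(P_pred, dtype=float) + np.asarray(sigma2, dtype=float))
    share = s[1] / s.sum()
    # Clip so that both arms always receive some participants.
    return float(np.clip(share, floor, 1.0 - floor))

def simulate(T=50, N=200, phi=0.95, q=0.01, sigma2=(1.0, 4.0), seed=0):
    """Run T periods with N participants each; return (effect estimate, its variance)."""
    rng = np.random.default_rng(seed)
    mu = np.array([0.0, 0.5])          # assumed true means: control, treatment
    m = np.zeros(2)                    # filtered means
    P = np.ones(2)                     # filtered variances
    P_pred = phi**2 * P + q            # predicted variances for the first period
    for _ in range(T):
        # True arm means drift according to the AR(1) state equation.
        mu = phi * mu + rng.normal(0.0, np.sqrt(q), size=2)
        share = allocate(P_pred, sigma2)
        n_treat = round(share * N)
        n = np.array([N - n_treat, n_treat])
        for a in range(2):
            # Period-level sample mean for arm a.
            y = rng.normal(mu[a], np.sqrt(sigma2[a] / n[a]))
            m[a], P[a], _ = kalman_step(m[a], P[a], y, n[a], phi, q, sigma2[a])
        P_pred = phi**2 * P + q
    # Arms are filtered independently, so the variances add.
    return m[1] - m[0], P[0] + P[1]
```

Calling `simulate()` returns the filtered treatment-effect estimate for the final period together with its filter variance. Because the noisier arm has the larger `sigma2`, this heuristic sends it more participants, which is the intuition behind variance-aware allocation under variance imbalance.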