Lifelong Best-Arm Identification with Misspecified Priors

Published: 20 Jul 2023, Last Modified: 30 Aug 2023 | EWRL16 | Readers: Everyone
Keywords: Bandits, Best arm identification, lifelong
TL;DR: It is possible and efficient to continuously learn an informative prior for fixed-budget best-arm identification in many lifelong bandit settings.
Abstract: We address the problem of lifelong fixed-budget best-arm identification (BAI), which arises in realistic sequential A/B testing scenarios where the value of each arm is correlated across test phases. We propose a hierarchical Gaussian generative model and develop a Bayesian fixed-budget BAI algorithm. Our main contribution is to investigate the impact of prior misspecification on the misidentification probability along the learning trajectory through an upper bound on a novel risk metric. We conduct extensive empirical evaluations of our algorithm against state-of-the-art methods on various types of martingales with different dependency structures. Our results show that our approach outperforms other algorithms across a wide range of settings.
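
To make the role of the prior concrete, the sketch below shows a standard conjugate Gaussian update for each arm, followed by a recommendation of the arm with the highest posterior mean. It is only an illustration under stated assumptions and is not the paper's algorithm: the function names `gaussian_posterior` and `recommend_arm`, the known noise variance, and the toy rewards are all hypothetical, and the hierarchical structure across test phases is not modeled.

```python
def gaussian_posterior(prior_mean, prior_var, rewards, noise_var=1.0):
    """Conjugate Gaussian update for one arm, assuming known noise variance.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(rewards)
    # Precisions (inverse variances) add under a Gaussian prior and likelihood.
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(rewards) / noise_var)
    return post_mean, post_var


def recommend_arm(prior_means, prior_vars, rewards_per_arm, noise_var=1.0):
    """Return the index of the arm with the highest posterior mean."""
    post_means = [
        gaussian_posterior(m, v, r, noise_var)[0]
        for m, v, r in zip(prior_means, prior_vars, rewards_per_arm)
    ]
    return max(range(len(post_means)), key=lambda i: post_means[i])


# Toy data (assumed): arm 0 has the higher observed rewards.
rewards = [[1.0, 1.0], [0.5, 0.5]]

# A neutral prior lets the data decide.
neutral_choice = recommend_arm([0.0, 0.0], [1.0, 1.0], rewards)

# A confident but wrong prior on arm 1 overrides the same data.
misspecified_choice = recommend_arm([0.0, 5.0], [1.0, 0.1], rewards)
```

With the same observations, the neutral prior recommends arm 0, while the overconfident prior on arm 1 recommends arm 1. This is the kind of misidentification caused by prior misspecification that the abstract refers to, here in a deliberately simplified single-phase setting.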
