Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals

2018 (modified: 08 Nov 2022), CoRR 2018
Abstract: This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and by multiplying those martingales. Our deviation inequalities allow us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
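As a rough illustration of the mixture-martingale idea mentioned in the abstract, the sketch below uses the classical Gaussian method of mixtures, not the paper's hierarchical prior or its Kullback-Leibler deviations. It assumes unit-variance Gaussian observations and a N(0, 1/c) prior on the tilting parameter; the function names and the parameter `c` are hypothetical choices made for this example. Under those assumptions the mixture has a closed form, and Ville's inequality bounds by delta the probability that it ever exceeds 1/delta, which yields a confidence interval valid uniformly in time.

```python
import math


def gaussian_mixture_martingale(s, t, c=1.0):
    """Closed form of the N(0, 1/c) mixture of exp(lam * s - lam**2 * t / 2).

    s is the centered sum of t unit-variance Gaussian observations.
    This is the textbook Gaussian mixture, assumed here for illustration only.
    """
    return math.sqrt(c / (t + c)) * math.exp(s * s / (2.0 * (t + c)))


def anytime_radius(t, delta, c=1.0):
    """Value of |s| at which the mixture martingale reaches 1 / delta."""
    return math.sqrt(2.0 * (t + c) * math.log(math.sqrt((t + c) / c) / delta))


def confidence_interval(sum_x, t, delta, c=1.0):
    """Time-uniform interval for the mean after t >= 1 observations summing to sum_x."""
    mean = sum_x / t
    half_width = anytime_radius(t, delta, c) / t
    return mean - half_width, mean + half_width
```

For example, `confidence_interval(sum_x, t, 0.05)` can be recomputed after every new observation, including under a data-dependent stopping rule, because the guarantee holds simultaneously for all t. The price is a half-width that shrinks slightly more slowly than a fixed-sample interval.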