Best Arm Identification with Correlated Sampling

17 Sept 2025 (modified: 23 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Best arm identification; Sample Complexity; Correlated Sampling; Fixed-Confidence
Abstract: Best arm identification (BAI) is an important research topic in sequential decision-making. In the fixed-confidence setting, the sample complexity, i.e., the number of samples needed to guarantee a given confidence level, serves as a fundamental metric for evaluating algorithms. Garivier & Kaufmann (2016) provided a tight characterization of this complexity as $\mathcal{H}^{\star}\log(1/\delta)$, where $\mathcal{H}^{\star}$ captures the problem hardness and $\delta$ is the confidence parameter. We improve this best-known bound to $\mathcal{T}^{\star}\log(1/\delta)$ with a strictly smaller hardness parameter $\mathcal{T}^{\star}$. Our approach is based on correlated sampling, which requires no assumptions on the reward function or the arm structure. A key theoretical challenge is that the resulting lower bound is defined by a non-convex optimization problem. To solve it, we propose an efficient method that decomposes the feasible region into sub-intervals and identifies the local optima within each. Moreover, we propose the first correlated-sampling-based BAI algorithm, CORSA, and prove its asymptotic optimality. Finally, we conduct numerical experiments to evaluate the algorithm's performance.
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 9803