Almost Optimal Variance-Constrained Best Arm Identification

Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong

Published: 2023, Last Modified: 12 May 2023, Venue: IEEE Trans. Inf. Theory 2023, Readers: Everyone

Abstract: We design and analyze Variance-Aware-Lower and Upper Confidence Bound (VA-LUCB), a parameter-free algorithm, for identifying the best arm under the fixed-confidence setup and under a stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold. An upper bound on VA-LUCB’s sample complexity is shown to be characterized by a fundamental variance-aware hardness quantity $H_{\mathrm{VA}}$. By proving an information-theoretic lower bound, we show that the sample complexity of VA-LUCB is optimal up to a factor logarithmic in $H_{\mathrm{VA}}$. Extensive experiments corroborate the dependence of the sample complexity on the various terms in $H_{\mathrm{VA}}$. By comparing VA-LUCB’s empirical performance to that of a close competitor, RiskAverse-UCB-BAI by David et al. (2018), our experiments suggest that VA-LUCB has the lowest sample complexity for this class of risk-constrained best arm identification problems, especially for the riskiest instances.
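
To make the identification target concrete, the following is a minimal sketch of the arm that a variance-constrained procedure aims to return, assuming the true means and variances were known. It is not VA-LUCB itself, which must estimate these quantities from samples; the function name `best_feasible_arm` and its arguments are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def best_feasible_arm(
    means: Sequence[float],
    variances: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the highest-mean arm whose variance is
    strictly smaller than `threshold`, or None if no arm is feasible.

    Assumption: means and variances are the true (known) parameters.
    A bandit algorithm would only have sample-based estimates.
    """
    # An arm is feasible only if its variance is strictly below the threshold.
    feasible = [i for i, v in enumerate(variances) if v < threshold]
    if not feasible:
        return None
    # Among feasible arms, pick the one with the largest mean.
    return max(feasible, key=lambda i: means[i])
```

For example, with means (0.9, 0.7, 0.5), variances (0.30, 0.10, 0.05), and threshold 0.2, the highest-mean arm is infeasible, so the target is the second arm (index 1).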
