Nearly Optimal Best Arm Identification for Semiparametric Bandits

Published: 03 Feb 2026, Last Modified: 03 Feb 2026 | AISTATS 2026 Poster | Readers: Everyone | License: CC BY 4.0
Abstract: We study the fixed-confidence Best Arm Identification (BAI) problem in semiparametric bandits, where rewards follow a linear model with an unknown additive baseline shift. While BAI is well understood for linear bandits, optimality in this semiparametric setting has remained open. First, we establish a sample complexity lower bound for the semiparametric bandit BAI problem. Next, we propose an efficient BAI algorithm and prove that its sample complexity matches the lower bound up to logarithmic factors. We also extend our results to the transductive BAI problem and obtain nearly optimal results.
Submission Number: 1927