Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms

William Réveillard; Richard Combes

Published: 18 Sept 2025, Last Modified: 10 Dec 2025. Venue: NeurIPS 2025 poster. License: CC BY 4.0

Keywords: Multi-armed bandits, Structured bandits, Non-convex optimization

TL;DR: We develop a computationally tractable algorithm to solve the Graves-Lai problem for multimodal bandits.

Abstract: We consider a stochastic multi-armed bandit problem with i.i.d. rewards where the expected reward function is multimodal with at most $m$ modes. We propose the first known computationally tractable algorithm for computing the solution to the Graves-Lai optimization problem, which in turn enables the implementation of asymptotically optimal algorithms for this bandit problem.
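For orientation, the Graves-Lai problem for a structured bandit is usually written in the following generic form. This is a sketch in assumed notation, not the paper's own: $K$ is the number of arms, $\mu_k$ the expected reward of arm $k$, $k^\star$ an optimal arm, $d(\cdot,\cdot)$ the Kullback-Leibler divergence between reward distributions with the given means, and $\mathcal{F}_m$ a stand-in symbol for the set of reward functions with at most $m$ modes.

$$
C(\mu) \;=\; \min_{\eta \in \mathbb{R}_{\ge 0}^{K}} \; \sum_{k=1}^{K} \eta_k \,(\mu_{k^\star} - \mu_k)
\quad \text{subject to} \quad
\sum_{k=1}^{K} \eta_k \, d(\mu_k, \lambda_k) \;\ge\; 1
\quad \text{for all } \lambda \in \Lambda(\mu),
$$

$$
\Lambda(\mu) \;=\; \bigl\{ \lambda \in \mathcal{F}_m \;:\; \lambda_{k^\star} = \mu_{k^\star}, \;\; \max_{k} \lambda_k > \lambda_{k^\star} \bigr\}.
$$

In this form, any uniformly good algorithm has regret satisfying $\liminf_{T \to \infty} R(T)/\log T \ge C(\mu)$, and $\eta_k \log T$ can be read as the number of times a suboptimal arm $k$ must be explored. The constraint ranges over the whole set $\Lambda(\mu)$ of confusing parameters, so the problem is semi-infinite in general, which is why tractability depends on the structure.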

Supplementary Material: zip

Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)

Submission Number: 20901
