Lipschitz Bandits with Batched Feedback

Yasong Feng; Zengfeng Huang; Tianyu Wang

Published: 31 Oct 2022, Last Modified: 07 Oct 2022, NeurIPS 2022 Accept, Readers: Everyone

Keywords: batched bandits, Lipschitz bandits

TL;DR: We introduce a novel landscape-aware algorithm, called Batched Lipschitz Narrowing (BLiN), that optimally solves Lipschitz bandits with batched feedback, and we provide theoretical lower bounds for this problem.

Abstract: We study Lipschitz bandit problems with batched feedback, where the expected reward is Lipschitz and reward observations are communicated to the player in batches. We introduce a novel landscape-aware algorithm, Batched Lipschitz Narrowing (BLiN), that optimally solves this problem. Specifically, we show that for a $T$-step problem with Lipschitz reward of zooming dimension $d_z$, our algorithm achieves the theoretically optimal regret rate (up to logarithmic factors) $\widetilde{\mathcal{O}}\left(T^{\frac{d_z+1}{d_z+2}}\right)$ using only $\mathcal{O}\left(\log\log T\right)$ batches. We also provide a complexity analysis for this problem. Our theoretical lower bound implies that $\Omega(\log\log T)$ batches are necessary for any algorithm to achieve the optimal regret. Thus, BLiN achieves the optimal regret rate using minimal communication.
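
To give a feel for how few batches $\mathcal{O}(\log\log T)$ is, the following is a minimal sketch of a doubly exponential batch grid with endpoints $t_k = T^{1 - 2^{-k}}$, a schedule familiar from the batched multi-armed bandit literature. This is an assumption chosen for illustration only: the abstract does not specify BLiN's batch schedule, and the function name `batch_endpoints` is hypothetical.

```python
import math


def batch_endpoints(T):
    """Illustrative batch end times t_k = ceil(T^(1 - 2^-k)), ending at T.

    Assumption: this grid is NOT taken from the BLiN paper; it only shows
    that a doubly exponential grid covers T steps in O(log log T) batches.
    """
    if T < 4:
        # Horizon too short to split; use a single batch.
        return [T]
    # After about log2(log2(T)) grid points, t_k has reached roughly T / 2.
    num_grid_points = math.ceil(math.log2(math.log2(T)))
    endpoints = []
    for k in range(1, num_grid_points + 1):
        t = min(T, math.ceil(T ** (1 - 2.0 ** (-k))))
        # Keep the endpoints strictly increasing.
        if not endpoints or t > endpoints[-1]:
            endpoints.append(t)
    # The final batch always runs to the end of the horizon.
    if endpoints[-1] < T:
        endpoints.append(T)
    return endpoints


if __name__ == "__main__":
    for horizon in (10**3, 10**6, 10**9):
        ends = batch_endpoints(horizon)
        print(horizon, len(ends), ends)
```

Under this grid, the number of batches is at most $\lceil \log_2 \log_2 T \rceil + 1$, so growing the horizon from $10^3$ to $10^9$ steps adds only a couple of batches.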

Supplementary Material: pdf
