Abstract: We consider a large number of agents collaborating on a multi-armed bandit problem with a large number of arms. We present an algorithm that improves upon the Gossip-Insert-Eliminate method of Chawla et al. [3]. We provide a regret bound showing that our algorithm is asymptotically optimal, and we present empirical results demonstrating lower regret on simulated data.
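To make the setting concrete, below is a minimal toy sketch of the general gossip-based insert/eliminate idea the abstract refers to: each agent runs UCB1 over a small active subset of the arms and periodically gossips its empirically best arm to a random peer, which inserts it and evicts its worst arm. All parameter values, names, and the simplified protocol here are illustrative assumptions, not the paper's algorithm or the exact method of Chawla et al. [3].

```python
import math
import random

random.seed(0)

K = 10             # total number of arms (illustrative)
N = 5              # number of agents
S = 3              # active-set size per agent
ROUNDS = 2000
GOSSIP_EVERY = 50  # rounds between gossip exchanges

# Hypothetical Bernoulli arm means; arm 0 is clearly best.
MEANS = [0.95] + [0.3] * (K - 1)

class Agent:
    """Runs UCB1 over a small active arm set; accepts gossiped recommendations."""

    def __init__(self, active):
        self.active = set(active)
        self.counts = {a: 0 for a in self.active}
        self.sums = {a: 0.0 for a in self.active}
        self.t = 0

    def emp(self, a):
        # Empirical mean; optimistic (infinite) for unplayed arms so they
        # are tried before being judged and never evicted untried.
        return self.sums[a] / self.counts[a] if self.counts[a] else float("inf")

    def play(self):
        self.t += 1
        arm = max(self.active, key=lambda a: (
            float("inf") if self.counts[a] == 0 else
            self.sums[a] / self.counts[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a])))
        reward = 1.0 if random.random() < MEANS[arm] else 0.0
        self.counts[arm] += 1
        self.sums[arm] += reward

    def best(self):
        played = [a for a in self.active if self.counts[a] > 0]
        return max(played, key=lambda a: self.sums[a] / self.counts[a])

    def insert(self, rec):
        """Insert a recommended arm, evicting the empirically worst arm
        (never the recommendation itself or the agent's own current best)."""
        if rec in self.active:
            return
        keep = {rec, self.best()}
        victim = min((a for a in self.active if a not in keep), key=self.emp)
        self.active.remove(victim)
        del self.counts[victim], self.sums[victim]
        self.active.add(rec)
        self.counts[rec], self.sums[rec] = 0, 0.0

# Each agent starts with a random subset of arms; ensure at least one agent
# holds the best arm so the recommendation has a chance to spread.
init_sets = [random.sample(range(K), S) for _ in range(N)]
if all(0 not in s for s in init_sets):
    init_sets[0][0] = 0
agents = [Agent(s) for s in init_sets]

for t in range(1, ROUNDS + 1):
    for ag in agents:
        ag.play()
    if t % GOSSIP_EVERY == 0:           # push-style gossip round
        for i, ag in enumerate(agents):
            peer = random.choice([j for j in range(N) if j != i])
            agents[peer].insert(ag.best())

spread = sum(1 for ag in agents if 0 in ag.active)
print(f"{spread}/{N} agents hold the best arm in their active set")
```

The payoff of this family of methods is that each agent pays per-round cost proportional to its small active set rather than the full arm count, while gossip lets good arms propagate through the network.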