Collaborative Regret Minimization in Multi-Armed Bandits

2023 (modified: 05 Feb 2023), CoRR 2023
Abstract: In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent reinforcement learning. For a fundamental problem in bandit theory, regret minimization in multi-armed bandits, we present the first and almost tight tradeoffs between the number of rounds of communication between the agents and the regret of the collaborative learning process.