Federated Multi-armed Bandits with Efficient Bit-Level Communications

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, NeurIPS 2025 poster, CC BY 4.0
Keywords: federated multi-armed bandit, communication-efficient algorithm
Abstract: In this work, we study the federated multi-armed bandit (FMAB) problem, in which distributed agents collaborate to minimize cumulative regret while interacting with a shared set of arms. Unlike traditional centralized bandit models, agents in FMAB settings are connected via a communication graph and cannot share data freely due to bandwidth limitations or privacy constraints. This raises a fundamental challenge: how to achieve optimal learning performance under stringent communication budgets. We propose a novel communication-efficient algorithm that decouples the learning process into two phases: one that eliminates suboptimal arms through early and frequent communication of key decisions, and another that refines global estimates using buffered, quantized, and differentially transmitted statistics. By carefully balancing the communication frequency and the precision of shared information, our algorithm achieves the optimal individual regret bound $O(N^{-1}\log T)$ while significantly reducing the total number of communication rounds and transmitted bits. Theoretically, we derive tight upper bounds on both individual cumulative regret and group regret, and prove that our method asymptotically matches the regret lower bound in federated settings. Experimental results on synthetic data validate the effectiveness of the proposed approach across various graph topologies and under heterogeneous feedback.
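To make the two-phase structure concrete, below is a minimal Python sketch of the general idea, not the authors' actual algorithm: in a first phase, agents broadcast only elimination decisions about arms, and in a second phase they report buffered, quantized differences of their local statistics. All names (`quantize`, the buffer size, the confidence-bound elimination rule, and the noise model) are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N, T = 10, 5, 20000            # arms, agents, horizon (illustrative values)
true_means = rng.uniform(0, 1, K)

def quantize(x, bits=4, lo=-1.0, hi=1.0):
    """Uniform quantizer: map values in [lo, hi] onto 2**bits levels."""
    levels = 2 ** bits - 1
    q = np.clip(np.round((x - lo) / (hi - lo) * levels), 0, levels)
    return lo + q / levels * (hi - lo)

# Phase 1 (sketch): round-robin exploration with periodic elimination votes.
active = np.arange(K)
counts = np.zeros((N, K))
sums = np.zeros((N, K))
t = 0
for _ in range(50):
    for i in range(N):
        for a in active:
            counts[i, a] += 1
            sums[i, a] += rng.normal(true_means[a], 0.1)
            t += 1
    # Agents communicate only elimination decisions (roughly 1 bit per arm),
    # not raw statistics; an arm survives if every agent keeps it plausible.
    means = sums[:, active] / counts[:, active]
    radius = np.sqrt(2 * np.log(max(t, 2)) / counts[:, active])
    keep = np.ones(len(active), dtype=bool)
    for i in range(N):
        keep &= (means[i] + radius[i]) >= (means[i] - radius[i]).max()
    active = active[keep]
    if len(active) == 1:
        break

# Phase 2 (sketch): refine estimates on surviving arms by sending buffered,
# quantized *differences* of local sample means every `buffer` pulls.
buffer, bits = 100, 4
last_sent = np.zeros((N, len(active)))   # value the server already holds
global_est = np.zeros(len(active))
for _ in range(t, T, buffer):
    for i in range(N):
        for j, a in enumerate(active):
            for _ in range(buffer // len(active)):
                counts[i, a] += 1
                sums[i, a] += rng.normal(true_means[a], 0.1)
        local = sums[i, active] / counts[i, active]
        # Transmit only the quantized change since the last report.
        delta = quantize(local - last_sent[i], bits=bits)
        last_sent[i] += delta
    global_est = last_sent.mean(axis=0)  # server averages agent reports

print("surviving arms:", active)
print("true best arm:", true_means.argmax(),
      "estimated best arm:", active[global_est.argmax()])
```

The sketch highlights the communication trade-off the abstract describes: Phase 1 sends coarse but frequent decisions so that suboptimal arms are dropped early, while Phase 2 sends infrequent, low-precision updates (quantized deltas accumulated on both sides) to refine the global estimates at a small bit cost.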
Supplementary Material: zip
Primary Area: Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
Submission Number: 12670