\section{Conclusion}
This work studies FMAB problems and introduces a fully distributed algorithm, \texttt{DRRB-bandit}. To address the challenge of heterogeneous feedback, we propose a consensus estimation subroutine that lets each agent estimate the global mean of every arm by communicating only with its neighbors. By balancing the contribution of each agent's data, this approach converges faster than previous methods.
Our analysis shows that the proposed algorithm reduces the upper bounds on both individual and group regret. Additionally, we discuss lower bounds for the heterogeneous federated bandit problem, proving that our algorithm achieves near-optimal performance among algorithms that use Round-Robin sampling.


\section{Acknowledgment}
We thank our colleague Mengfan Xu and the anonymous reviewers for their valuable feedback. This work was supported by NSFC (no. 62306138), Jiangsu NSF (no. BK20230784), and the Innovation Program of State Key Laboratory for Novel Software Technology at Nanjing University (no. ZZKT2025B25).