Harnessing the Power of Federated Learning in Federated Contextual Bandits

TMLR Paper1974 Authors

22 Dec 2023 (modified: 12 Apr 2024); under review for TMLR
Abstract: Federated learning (FL) has demonstrated great potential in revolutionizing distributed machine learning, and tremendous efforts have been made to extend it beyond the original focus on supervised learning. Among many directions, federated contextual bandits (FCB), a pivotal integration of FL and sequential decision-making, has garnered significant attention in recent years. Despite substantial progress, existing FCB approaches have largely employed their tailored FL components, often deviating from the canonical FL framework. Consequently, even renowned algorithms like FedAvg remain under-utilized in FCB, let alone other FL advancements. Motivated by this disconnection, this work takes one step towards building a tighter relationship between the canonical FL study and the investigations on FCB. In particular, a novel FCB design, termed FedIGW, is proposed to leverage a regression-based CB algorithm, i.e., inverse gap weighting. Compared with existing FCB approaches, the proposed FedIGW design can better harness the entire spectrum of FL innovations, which is concretely reflected as (1) flexible incorporation of (both existing and forthcoming) FL protocols; (2) modularized plug-in of FL analyses in performance guarantees; (3) seamless integration of FL appendages (such as personalization, robustness, and privacy). We substantiate these claims through rigorous theoretical analyses and empirical evaluations.
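For readers unfamiliar with inverse gap weighting (IGW), the regression-based CB rule named in the abstract, the following is a minimal sketch of its standard action-sampling step. It is not the FedIGW implementation: the function name `igw_probabilities`, the exploration parameter `gamma`, and the example reward predictions are illustrative assumptions, and in a federated setting the predictions would presumably come from a reward model trained with an FL protocol such as FedAvg.

```python
import numpy as np


def igw_probabilities(predicted_rewards, gamma):
    """Standard inverse gap weighting over K actions (illustrative sketch).

    predicted_rewards: reward estimates for each action in the current context
                       (assumed to come from some learned regression model).
    gamma: exploration parameter (assumed name); larger values put more
           probability on the greedy action.
    """
    preds = np.asarray(predicted_rewards, dtype=float)
    k = preds.size
    best = int(np.argmax(preds))

    # Gap between the greedy action's estimate and every other estimate.
    gaps = preds[best] - preds

    # Non-greedy actions get probability inversely proportional to their gap.
    probs = 1.0 / (k + gamma * gaps)

    # The greedy action receives whatever probability mass remains.
    probs[best] = 0.0
    probs[best] = 1.0 - probs.sum()
    return probs


# Hypothetical reward predictions for three actions.
example_probs = igw_probabilities([0.5, 0.2, 0.9], gamma=10.0)
rng = np.random.default_rng(0)
chosen_action = int(rng.choice(len(example_probs), p=example_probs))
```

Because each non-greedy action receives at most 1/K probability, the greedy action always keeps at least 1/K, so the result is a valid distribution for any non-negative `gamma`.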
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: All changes suggested by the reviewers have been incorporated in the revised draft and are highlighted in red. The major modifications are listed below.
- Additional discussion of the related work [R1] is provided at the end of Section 1 to highlight the contributions of this work;
- The connections between Assumption 3.1 and canonical FL studies are further clarified at the end of Section 3.1;
- Remark 4.6 is added to discuss the analyses beyond linear reward functions;
- Figure 3 and Section 5 are updated to incorporate the suggested baselines (i.e., softmax and greedy) and the further fine-tuned FN-UCB.
Assigned Action Editor: ~Peter_Richtarik1
Submission Number: 1974