Enhanced Federated Optimization: Adaptive Unbiased Client Sampling with Reduced Variance

TMLR Paper3267 Authors

31 Aug 2024 (modified: 25 Nov 2024) · Under review for TMLR · CC BY 4.0
Abstract: Federated Learning (FL) is a distributed learning paradigm that trains a global model across multiple devices without collecting local data. In FL, a server typically selects a subset of clients for each training round to optimize resource usage. Central to this process is unbiased client sampling, which ensures a representative selection of clients. Current methods primarily rely on a random sampling procedure which, despite its effectiveness, yields suboptimal efficiency owing to a loose convergence upper bound caused by sampling variance. In this work, by adopting an independent sampling procedure, we propose a federated optimization framework focused on adaptive unbiased client sampling, improving the convergence rate via an online variance reduction strategy. In particular, we present the first adaptive client sampler, K-Vib, employing an independent sampling procedure. K-Vib achieves a linear speed-up on the regret bound $\tilde{\mathcal{O}}\big(N^{\frac{1}{3}}T^{\frac{2}{3}}/K^{\frac{4}{3}}\big)$ within a fixed communication budget $K$. Empirical studies show that K-Vib achieves roughly a twofold speed-up over baseline algorithms, demonstrating significant potential in federated optimization.
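To make the "unbiased independent sampling" idea in the abstract concrete, below is a minimal sketch (not the paper's K-Vib sampler) of how a server could aggregate updates when each client is included independently with its own inclusion probability and sampled updates are reweighted by inverse probabilities, so the aggregate remains an unbiased estimate of full participation. All names (independent_unbiased_aggregate, probs, weights, updates) are illustrative assumptions, not the authors' notation.

```python
import numpy as np

def independent_unbiased_aggregate(updates, weights, probs, rng=None):
    """Aggregate client updates under independent (Bernoulli) sampling.

    Each client i is included independently with probability probs[i];
    sampled updates are scaled by weights[i] / probs[i], so the expected
    aggregate equals the full-participation sum_i weights[i] * updates[i]
    (an unbiased, Horvitz-Thompson-style estimator).
    """
    rng = np.random.default_rng() if rng is None else rng
    sampled = rng.random(len(probs)) < probs            # independent Bernoulli draws
    agg = np.zeros_like(updates[0])
    for i in np.flatnonzero(sampled):
        agg += (weights[i] / probs[i]) * updates[i]     # inverse-probability weighting
    return agg, sampled

# Toy usage: 10 clients, expected communication budget K = 3.
N, K = 10, 3
rng = np.random.default_rng(0)
updates = [rng.normal(size=5) for _ in range(N)]        # stand-ins for local model updates
weights = np.full(N, 1.0 / N)                           # aggregation weights (e.g., data fractions)
probs = np.full(N, K / N)                               # inclusion probabilities summing to K
agg, sampled = independent_unbiased_aggregate(updates, weights, probs, rng)
```

An adaptive sampler such as the one described in the abstract would, under this setup, adjust the per-client probabilities online to reduce the variance of this estimator; the uniform probabilities above are only a placeholder.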
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=CKQ3sMt4tx
Changes Since Last Submission: We thank the AC and all the reviewers for their efforts in reviewing our work. A summary of the changes:
- We carefully refined the writing to ensure the theoretical results are clearly connected, revised several concept descriptions for better understanding, and reorganized the experiment section for a smoother reading experience.
- We refined the convergence analysis of FedAvg with arbitrary client sampling; the new convergence rate matches the "optimal client sampling" result of previous work.
- Building on the new convergence analysis, we provide end-to-end convergence guarantees (FedAvg + K-Vib) at the end of Section 5.
- We added two natural language processing tasks, each with three levels of data distribution. Most importantly, the new experiments involve larger models based on the popular Transformer and BERT architectures, trained on the large-scale AGNews and CCNews datasets, demonstrating the proposed method's applicability to real-world settings.
- Explanation of recent updates: we revised the convergence analysis and fixed a few constant errors; our main conclusions do not change.
Assigned Action Editor: ~Sebastian_U_Stich1
Submission Number: 3267