Revisiting System-Heterogeneous Federated Learning through Dynamic Model Search

TMLR Paper 2685 Authors

14 May 2024 (modified: 04 Jul 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: Federated learning is a distributed learning paradigm in which multiple clients collaboratively train a global model while keeping their data local. These clients differ in available memory and network bandwidth, and how to utilize that memory and bandwidth to the fullest for the best global model performance remains an open challenge. In this paper, we propose to assign each client a subset of the global model that differs both in which layers it contains and in the number of channels per layer. To realize this, we design a constrained model search process with early stopping, which makes it efficient to find models in such a very large space, and a data-free knowledge distillation mechanism, which improves global model performance when aggregating models with such different structures. For fair and reproducible comparisons between solutions, we allocate memory and bandwidth to each client according to memory and bandwidth logs collected on real devices. Our evaluation shows that, compared to existing state-of-the-art system-heterogeneous federated learning methods, our solution improves accuracy by 2.43\% to 15.81\% and utilizes 5\% to 40\% more memory and bandwidth with negligible extra running time, across different levels of available memory and bandwidth, non-i.i.d.~datasets, and image and text tasks.
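To make the core idea of the abstract concrete, below is a minimal illustrative sketch (not the authors' implementation) of assigning each client a subset of the global model by keeping a prefix of its layers and a fraction of the channels in each layer, subject to a per-client memory budget. All names here (estimate_cost, submodel_config, the budgets and channel counts) are hypothetical; the paper's actual constrained search with early stopping and its data-free knowledge distillation are not shown.

```python
from typing import List

def estimate_cost(channels: List[int]) -> float:
    """Rough proxy for memory cost: weights between consecutive layers
    scale with c_in * c_out, so we sum those products."""
    return float(sum(c_in * c_out for c_in, c_out in zip(channels, channels[1:])))

def submodel_config(global_channels: List[int], budget: float) -> List[int]:
    """Shrink the global model until it fits the client's budget.

    First thins channels, then drops trailing layers, yielding submodels
    that differ in both depth and per-layer width, as the abstract describes.
    """
    channels = list(global_channels)
    width = 1.0
    while estimate_cost([max(1, int(c * width)) for c in channels]) > budget:
        if width > 0.25:           # thin channels first ...
            width -= 0.125
        elif len(channels) > 2:    # ... then reduce depth
            channels.pop()
        else:
            break                  # smallest submodel we allow
    return [max(1, int(c * width)) for c in channels]

# Example: three clients with heterogeneous memory budgets receive three
# structurally different subsets of a 5-layer global model.
global_channels = [64, 128, 256, 256, 512]
for budget in (3e5, 1.5e5, 5e4):
    print(budget, submodel_config(global_channels, budget))
```

In a full system, the search over such configurations would be constrained and stopped early rather than enumerated greedily as above, and the resulting heterogeneous submodels would be aggregated with the paper's data-free knowledge distillation mechanism.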
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=Pg16UFZYMp
Changes Since Last Submission: The paper is formatted according to TMLR's style file. The page header contains "Under review as submission to TMLR".
Assigned Action Editor: ~Yiming_Ying1
Submission Number: 2685