Keywords: Federated Learning, Partial Network Updates, Convergence Efficiency, Computational and Communicational Overhead Reduction
TL;DR: We observe the layer mismatch problem in federated learning and propose a partial network update method to address it, improving convergence speed and accuracy while reducing computational and communication overhead.
Abstract: Federated learning is a distributed machine learning paradigm designed to protect user data privacy, and it has been deployed successfully across a variety of scenarios. In traditional federated learning, the entire parameter set of each local model is updated and averaged in every training round. Although this full network update maximizes knowledge acquisition and sharing for each layer, it prevents the layers of the global model from cooperating effectively to complete each client's task, a challenge we refer to as layer mismatch. The mismatch recurs after every parameter averaging step, slowing convergence and degrading overall performance. To address layer mismatch, we introduce FedPart, which restricts model updates to a single layer or a few layers in each communication round. To maintain the efficiency of knowledge acquisition and sharing, we develop several strategies for selecting the trainable layers in each round, including sequential updating and multi-round cycle training. Theoretical analysis and experiments show that FedPart significantly outperforms conventional full network updates in convergence speed and accuracy, while also reducing communication and computational overhead.
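The partial update idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`trainable_layers`, `local_update`, `aggregate`), the `rounds_per_layer` parameter, the plain gradient step, and the unweighted average are all assumptions made for illustration. The sketch assumes a sequential schedule that trains one layer at a time and repeats the cycle, with only the selected layer being updated locally and averaged on the server.

```python
from typing import Dict, List

# Hypothetical representation: layer name -> flat list of parameters.
Params = Dict[str, List[float]]


def trainable_layers(round_idx: int, layer_names: List[str],
                     rounds_per_layer: int = 1) -> List[str]:
    """Assumed sequential schedule: one layer per round, cycling over all layers."""
    cycle_len = len(layer_names) * rounds_per_layer
    pos = round_idx % cycle_len
    return [layer_names[pos // rounds_per_layer]]


def local_update(global_params: Params, trainable: List[str],
                 grads: Params, lr: float = 0.1) -> Params:
    """Client step: apply a gradient step only to the selected layers."""
    new = {name: list(vals) for name, vals in global_params.items()}
    for name in trainable:
        new[name] = [w - lr * g for w, g in zip(new[name], grads[name])]
    return new


def aggregate(global_params: Params, client_params: List[Params],
              trainable: List[str]) -> Params:
    """Server step: average only the selected layers; keep the rest unchanged."""
    new = {name: list(vals) for name, vals in global_params.items()}
    n = len(client_params)
    for name in trainable:
        size = len(new[name])
        new[name] = [sum(c[name][i] for c in client_params) / n
                     for i in range(size)]
    return new
```

Under these assumptions, only the parameters of the selected layers would need to be transmitted in a given round, which is one way to read the communication savings claimed in the abstract.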
Supplementary Material: zip
Primary Area: Other (please use sparingly, only use the keyword field for more details)
Submission Number: 6149