FedSMU: Communication-Efficient and Generalization-Enhanced Federated Learning through Symbolic Model Updates

Xinyi Lu; Hao Zhang; Chenglin Li; Weijia Lu; ZHIFEI YANG; Wenrui Dai; xiaodong Zhang; Xiaofeng Ma; Can Zhang; Junni Zou; Hongkai Xiong

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0

Keywords: Federated learning, Efficient Communication, Enhanced Generalization

Abstract: Significant communication overhead and client data heterogeneity pose important challenges to the current federated learning (FL) paradigm. Most compression-based and optimization-based FL algorithms address either the communication overhead or the data heterogeneity issue individually, rather than tackling both. In this paper, we observe that symbolizing the client model updates to be uploaded (i.e., normalizing the magnitude of each model parameter at local clients) mitigates the model heterogeneity that essentially stems from data heterogeneity, thereby helping improve the overall generalization performance of the globally aggregated model at the server. Inspired by this observation, and further motivated by the success of the Lion optimizer in achieving optimal performance on most tasks in centralized learning, we propose a new FL algorithm, called FedSMU, which simultaneously reduces the communication overhead and alleviates the data heterogeneity issue. Specifically, FedSMU splits the standard Lion optimizer into local updates and global execution, where only the sign (symbol) of the client model updates is communicated between the clients and the server. We theoretically prove the convergence of FedSMU in general non-convex settings. Through extensive experimental evaluations on several benchmark datasets, we demonstrate that FedSMU not only reduces the communication overhead but also achieves better generalization performance than other compression-based and optimization-based baselines.
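The idea of uploading only the sign of each client update can be illustrated with a small sketch. This is not the FedSMU algorithm as specified in the paper: the Lion-style momentum is omitted, and the toy quadratic client objectives, the function names (`local_delta`, `symbolize`, `server_step`, `federated_round`), and all hyperparameter values are assumptions made purely for illustration. It shows only the communication pattern the abstract describes: clients train locally, upload one sign per parameter, and the server aggregates the signs.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def local_delta(weights, optimum, lr=0.1, steps=5):
    """Hypothetical client: gradient steps on 0.5 * ||w - optimum||^2.

    Each client has its own optimum, a toy stand-in for data heterogeneity.
    Returns the local model update (new weights minus received weights).
    """
    w = list(weights)
    for _ in range(steps):
        w = [wi - lr * (wi - ci) for wi, ci in zip(w, optimum)]
    return [wi - w0 for wi, w0 in zip(w, weights)]


def symbolize(delta):
    """Keep only the sign of each coordinate (roughly 1 bit, not a 32-bit float)."""
    return [sign(d) for d in delta]


def server_step(weights, client_signs, global_lr=0.05):
    """Average the uploaded signs and take one step with a global learning rate."""
    n = len(client_signs)
    avg = [sum(s[i] for s in client_signs) / n for i in range(len(weights))]
    return [w + global_lr * a for w, a in zip(weights, avg)]


def federated_round(weights, optima, global_lr=0.05):
    """One round: every client trains locally, uploads signs, server aggregates."""
    uploads = [symbolize(local_delta(weights, c)) for c in optima]
    return server_step(weights, uploads, global_lr)


# Three clients with different optima (assumed toy values).
client_optima = [[1.0, -2.0], [3.0, -2.0], [2.0, 4.0]]
weights = [0.0, 0.0]
for _ in range(20):
    weights = federated_round(weights, client_optima)
print(weights)
```

Because every client contributes a value of the same magnitude per coordinate, a client with an unusually large update cannot dominate the aggregate. This is one way to read the abstract's claim that normalizing update magnitudes mitigates model heterogeneity.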

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 9592
