FEDEMOE: IMPROVING PERSONALIZATION ON HETEROGENEOUS FEDERATED LEARNING VIA ELASTIC MIXTURE OF EXPERTS ARCHITECTURE

09 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Heterogeneous federated learning, elastic mixture of experts, personalization, knowledge transfer, statistical heterogeneity
Abstract: Heterogeneous federated learning (HtFL) has emerged as a promising approach to handling the heterogeneity in local computational resources and data distributions that is common in the real world. However, existing methods degrade model personalization because personalized and generalized knowledge are either intertwined or one dominates the other. To address this issue, we propose FedEMoE, a novel Elastic Mixture of Experts (EMoE) architecture for HtFL that decouples personalization from generalization. Specifically, FedEMoE employs a multi-scale feature extraction mechanism via personalized experts to enrich personalized knowledge. Furthermore, we design an elastic shared expert to break the knowledge-transfer bottleneck across heterogeneous client models; it adaptively expands or shrinks according to the status of each expert, as determined by weight spectrum analysis. Moreover, the sparsity of the mixture of experts (MoE) alleviates the loss of personalized knowledge that typically results from dense model aggregation. Extensive experiments under both statistical and model heterogeneity settings demonstrate that FedEMoE significantly outperforms state-of-the-art federated learning methods for every heterogeneous client model across diverse datasets.
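To make the described architecture concrete, below is a minimal PyTorch-style sketch of a sparse MoE layer with per-client personalized experts, an always-active shared expert, and an "elastic" resizing step driven by a simple weight-spectrum heuristic. All class and function names (Expert, ElasticMoELayer, effective_rank, resize_shared_expert) and the specific rank threshold are illustrative assumptions for exposition, not the paper's actual implementation.

```python
# Sketch only: sparse MoE with personalized experts + an elastic shared expert.
# The spectral grow/shrink rule below is an assumed heuristic, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """A simple two-layer feed-forward expert."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


class ElasticMoELayer(nn.Module):
    """Sparse top-k routing over personalized experts plus one always-on shared expert."""
    def __init__(self, dim: int, num_personal: int = 4, hidden: int = 64,
                 shared_hidden: int = 64, top_k: int = 2):
        super().__init__()
        self.personal = nn.ModuleList([Expert(dim, hidden) for _ in range(num_personal)])
        self.shared = Expert(dim, shared_hidden)   # the part exchanged with the server
        self.gate = nn.Linear(dim, num_personal)
        self.top_k = top_k

    def forward(self, x):
        # Route each sample to its top-k personalized experts only (sparse activation),
        # so personalized knowledge is not overwritten by dense aggregation.
        scores = F.softmax(self.gate(x), dim=-1)
        topv, topi = scores.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.personal):
                mask = (topi[:, slot] == e)
                if mask.any():
                    out[mask] += topv[mask, slot:slot + 1] * expert(x[mask])
        # The shared expert is always active and carries generalized knowledge.
        return out + self.shared(x)


def effective_rank(weight: torch.Tensor, thresh: float = 0.01) -> int:
    """Count singular values above a fraction of the largest one (a simple spectral proxy)."""
    s = torch.linalg.svdvals(weight)
    return int((s > thresh * s[0]).sum())


def resize_shared_expert(layer: ElasticMoELayer, dim: int, grow: int = 16) -> int:
    """Illustrative elastic step: widen the shared expert when its weight spectrum looks
    saturated (effective rank close to its hidden width), otherwise shrink it."""
    w = layer.shared.fc1.weight.detach()
    hidden = w.shape[0]
    rank = effective_rank(w)
    new_hidden = hidden + grow if rank > 0.9 * hidden else max(grow, hidden - grow)
    layer.shared = Expert(dim, new_hidden)   # re-initialized here for simplicity
    return new_hidden


if __name__ == "__main__":
    dim = 32
    layer = ElasticMoELayer(dim)
    x = torch.randn(8, dim)
    print(layer(x).shape)                       # torch.Size([8, 32])
    print(resize_shared_expert(layer, dim))     # new shared-expert hidden width
```

In this sketch only the shared expert would be sent to the server for aggregation, while the gated personalized experts stay local; the resizing step stands in for the paper's spectrum-based expansion/shrinkage of the shared expert between communication rounds.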
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3413