Abstract: Federated learning (FL) enables edge devices to cooperatively train a globally shared model while keeping the training data local and private. However, a prevalent yet impractical assumption in FL requires the participating edge devices to train on an identical global model architecture. Recent research has attempted to address this problem in FL using public datasets. Nevertheless, acquiring data distributions that closely match those of participating users poses a significant challenge. In this study, we propose an FL method called Federated Intermediate Layers Learning (FedIN), which supports heterogeneous models without relying on any public datasets. Instead, FedIN leverages the inherent knowledge embedded in client model features to facilitate knowledge exchange. To harness the knowledge from client features, we propose Intermediate Layers (IN) training, which aligns intermediate layers based on features obtained from other clients. IN training requires only minimal memory and communication overhead by employing a single batch of client features. Additionally, we formulate and solve a convex optimization problem to mitigate gradient divergence stemming from model heterogeneity. The experimental results demonstrate the superior performance of FedIN in heterogeneous model settings compared to state-of-the-art algorithms. Furthermore, the experiments discuss how to protect user privacy against leakage from IN features, and our ablation study illustrates the effectiveness of IN training.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=dv5WKcgpbj
Changes Since Last Submission: We sincerely thank all the reviewers and the AE for their constructive and valuable comments, which have greatly helped us in making substantial improvements to our paper. Below, we outline the key changes made since the last TMLR submission:
1. **Introduction**: We have condensed the descriptions regarding the motivation for using model heterogeneity to address system heterogeneity.
2. **Figure 2**: The figure has been revised to clearly differentiate between 'local training' and 'IN training' and to provide a detailed illustration of one round of the FedIN process.
3. **Section 4.1 and Figure 2**: We have elaborated on the construction of $(s_{in}, s_{out})$ to enhance clarity and understanding.
4. **Section 4.5**: We have discussed the limitations of FedIN and proposed two methods to address these limitations. Moreover, we conducted experiments in Section 5.2 to evaluate the feasibility of one proposed method.
5. **Personalized FL Baselines**: We have added five baselines: FedGen, FedET, and, following the AE's suggestions, the personalized FL baselines FedFomo, FedDPA, and FedSelect, as shown in Table 1, bringing the total to 15 baselines. Additionally, Table 1 has been updated to include the standard deviation for all datasets with non-IID settings, based on results from three random seeds.
6. **Additional Datasets and Model Architectures**: Following the comments from reviewers and the AE, we have included experiments on an additional dataset, CINIC-10. We have also conducted experiments using Vision Transformers (ViTs) on four datasets with two different distributions, with detailed results provided in Appendix D.3.
7. **Figure 10(a)**: We have replaced the reconstruction loss with image similarities using LPIPS to evaluate the quality of reconstructed images more informatively.
8. **Appendix A**: We have illustrated the details of the derivation process for Section 4.2.
9. **Appendix B**: We have formulated the details of two aggregation methods and provided two figures to demonstrate their differences.
10. **Appendix C**: We have elaborated on the differential privacy analysis for the privacy-preserving method proposed in Section 4.3.
11. **Appendix D**: We have shown more details of our experiments, including the model architectures of FedIN, the details of baselines, and the experiment results of ViTs.
We have combined the main text and the Appendix into one PDF for convenience. Finally, we are grateful for the time and effort put in by the reviewers and the AE in evaluating this work.
Sincerely,
The authors
Assigned Action Editor: ~Tian_Li1
Submission Number: 4522