Federated Adapter on Foundation Models: An Out-Of-Distribution Approach

ICLR 2025 Conference Submission1050 Authors

16 Sept 2024 (modified: 23 Nov 2024)ICLR 2025 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Federated Learning, Foundation Models
Abstract: As foundation models gain increasing attention from both academic and industrial communities, Federated Foundation Models (FedFM) have emerged as a privacy-preserving approach for collaboratively fine-tuning models in federated learning (FL) frameworks using distributed datasets across multiple clients. A key challenge for FedFM, given the versatile nature of foundation models, is addressing out-of-distribution (OOD) generalization, where unseen tasks or clients may exhibit distribution shifts leading to suboptimal performance. Although numerous studies have explored OOD generalization in conventional FL, these methods are inadequate for FedFM due to the challenges posed by large parameter scales and increased data heterogeneity, where large parameter scales would result in high computational and communication costs while increased data heterogeneity like cross-domain would lead to suboptimal performance of the aggregated global model on individual client distributions. To bridge this gap, we propose a new method, called FedOA, to enhance the OOD generalization of FedFM under these conditions. Specifically, our method employs adapter-based parameter-efficient fine-tuning methods for efficient learning, and introduces an additional personalized model with a feature distance-based regularization to ensure distribution alignment and provide OOD generalization guarantees for each client. Theoretically, we demonstrate that the conventional aggregated global model in FedFM inherently retains OOD generalization capabilities, and our proposed method enhances the personalized model's OOD generalization through regularization informed by the global model, with proven convergence under general non-convex settings. Empirically, the effectiveness of the proposed method is validated on benchmark datasets across various NLP tasks.
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1050
Loading