From Mismatch to Harmony: Resolving Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation
Keywords: Personalized Federated Learning, Data Heterogeneity, Feature-Classifier Mismatch, Prompt-Driven Feature Transformation
Abstract: In conventional Federated Learning approaches such as FedAvg, training a global model becomes challenging in the presence of data heterogeneity. To address this, Personalized Federated Learning (PFL) has emerged as a leading solution, enabling clients to train personalized models tailored to their local data distributions. Surprisingly, our linear probe experiments reveal that FedAvg’s feature extractor outperforms that of most PFL methods on local client data. Even more intriguingly, applying a simple linear transformation to align the local features from FedAvg’s extractor with the classifier enables FedAvg to surpass most PFL methods. These findings suggest that, under data heterogeneity, FedAvg’s weaker performance stems not from inadequate global model training but from a mismatch between local features and the classifier. This observation motivates us to develop a new framework that addresses this mismatch. A straightforward solution would be to insert the personalized linear transformation layer mentioned above between the global feature extractor and the global classifier. However, this approach easily overfits the limited local training data due to its large number of personalized parameters, and it is insufficient for complex datasets. In this paper, we introduce FedPFT, which leverages personalized prompts to resolve the mismatch problem. These prompts, together with the local features, are fed into a shared self-attention-based module, where the features are transformed via the attention mechanism to align with the global classifier. The prompts contain only a small number of trainable parameters, reducing the risk of overfitting to local data. Moreover, this prompt-driven approach is highly flexible, allowing task-specific prompts to incorporate additional training objectives (e.g., contrastive learning) that further enhance performance. Our experiments demonstrate that FedPFT outperforms state-of-the-art methods by up to 5.07%, with additional improvements of up to 7.08% when collaborative contrastive learning is introduced.
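The prompt-driven feature transformation described in the abstract can be pictured with a minimal sketch. The module name, the residual connection, the number of prompt tokens, and all dimensions below are illustrative assumptions for exposition, not details taken from the submission; only the high-level flow (personalized prompts plus local features entering a shared self-attention module whose output feeds the global classifier) follows the abstract.

```python
import torch
import torch.nn as nn

class PromptFeatureTransform(nn.Module):
    """Hypothetical sketch: personalized prompt tokens are concatenated with the
    backbone feature and passed through a shared self-attention block; the
    transformed feature is then classified by the shared global classifier."""

    def __init__(self, feat_dim=512, num_prompts=4, num_heads=8, num_classes=100):
        super().__init__()
        # Personalized parameters: a handful of prompt tokens kept on the client.
        self.prompts = nn.Parameter(torch.randn(num_prompts, feat_dim) * 0.02)
        # Shared (server-aggregated) transformation module and classifier.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, feat):
        # feat: (B, feat_dim) features from the global feature extractor.
        B = feat.size(0)
        tokens = torch.cat(
            [feat.unsqueeze(1), self.prompts.unsqueeze(0).expand(B, -1, -1)],
            dim=1,
        )  # (B, 1 + num_prompts, feat_dim)
        attended, _ = self.attn(tokens, tokens, tokens)
        # Take the transformed feature token (residual connection is an assumption).
        aligned = self.norm(feat + attended[:, 0])
        return self.classifier(aligned)
```

In this sketch, only `self.prompts` would be excluded from server-side aggregation, mirroring the abstract's point that the personalized part carries few trainable parameters while the feature extractor, transformation module, and classifier remain shared.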
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4249