Tackling Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Personalized Federated Learning, Feature-Classifier Mismatch, Prompt-Driven Feature Transformation
Abstract: Federated Learning (FL) faces challenges due to data heterogeneity, which limits the global model’s performance across diverse client distributions. Personalized Federated Learning (PFL) addresses this by enabling each client to possess an individual model adapted to its local distribution. Many existing methods assume that certain global model parameters are difficult to train effectively in a collaborative manner under heterogeneous data. Consequently, they localize or fine-tune these parameters to obtain personalized models. In this paper, we reveal that both the feature extractor and classifier of the global model are inherently strong, and that the primary cause of its suboptimal performance is the mismatch between local features and the global classifier. Although existing methods alleviate this mismatch to some extent and improve performance, we find that they either (1) fail to fully resolve the mismatch while degrading the feature extractor, or (2) address the mismatch only post-training, allowing it to persist during training. This increases inter-client gradient divergence, hinders model aggregation, and ultimately leaves the feature extractor suboptimal for client data. To address this issue, we propose FedPFT, a novel framework that resolves the mismatch during training using personalized prompts. These prompts, along with local features, are processed by a shared self-attention-based transformation module, ensuring alignment with the global classifier. Additionally, this prompt-driven approach offers strong flexibility, enabling task-specific prompts to incorporate additional training objectives (e.g., contrastive learning) to further enhance the feature extractor. Extensive experiments show that FedPFT outperforms state-of-the-art methods by up to 5.07%, with further gains of up to 7.08% when collaborative contrastive learning is incorporated.
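The core mechanism described above — client-specific prompt tokens attended jointly with a local feature by a shared transformation module before classification — can be illustrated with a minimal single-head self-attention sketch. This is not the paper's implementation; all names, dimensions, and the single-head design are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prompt_feature_transform(feature, prompts, Wq, Wk, Wv):
    """Run self-attention over [personalized prompts; local feature] and
    return the transformed feature token (last row), which would then be
    fed to the shared global classifier. Hypothetical sketch, not FedPFT's
    actual module."""
    tokens = np.vstack([prompts, feature[None, :]])        # (p + 1, d)
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (p+1, p+1)
    out = attn @ V
    return out[-1]                                          # transformed feature

d = 8   # feature dimension (illustrative)
p = 2   # number of personalized prompt tokens per client (illustrative)

feature = rng.standard_normal(d)        # local feature from the shared extractor
prompts = rng.standard_normal((p, d))   # client-specific, locally trained prompts
# Shared (globally aggregated) attention weights, small random init for the demo
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

z = prompt_feature_transform(feature, prompts, Wq, Wk, Wv)
print(z.shape)  # (8,)
```

Because the attention weights are shared and aggregated globally while the prompts stay local, each client can steer the transformation toward its own distribution without fragmenting the classifier — which is the alignment effect the abstract attributes to FedPFT.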
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 18716