Keywords: Federated Distillation, Natural Language Processing, Out-of-Distribution
Abstract: Federated Learning is a distributed machine learning paradigm that trains a global model by aggregating local client updates without sharing each client's private data. Federated Distillation (FD) builds on this paradigm by leveraging knowledge distillation to exchange soft predictions on proxy data instead of model parameters, enabling more efficient communication and supporting collaboration among heterogeneous models. However, FD models trained on in-distribution data adapt poorly to Out-of-Distribution (OOD) scenarios. In this paper, we propose a domain-aware proxy selection framework that better adapts proxy data to OOD problems. Experimental results show that the proposed models effectively address distribution shifts under OOD settings with and without proxy data, achieving average scores of 82.9\% and 81.0\% respectively and surpassing existing works on standard benchmarks. The code and data are released at https://anonymous.4open.science/r/DPS-FD-8596/.
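To make the FD exchange described in the abstract concrete, below is a minimal sketch of one communication round, assuming a generic setup rather than this paper's exact method: the `softmax` temperature, the `fd_round` helper, and the uniform averaging rule are all illustrative assumptions. The key property it demonstrates is that clients share only soft predictions on a common proxy set, never model parameters.

# Minimal sketch of one Federated Distillation (FD) round (illustrative,
# not the paper's method): each client shares only soft predictions
# (logits) on a shared proxy set; the server averages them into soft
# targets that each client then distills from locally.
import numpy as np

def softmax(z: np.ndarray, temperature: float = 2.0) -> np.ndarray:
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fd_round(client_logits: list[np.ndarray]) -> np.ndarray:
    """Aggregate clients' soft predictions on the shared proxy data.

    client_logits: one (num_proxy_examples, num_classes) array per client.
    Returns averaged soft targets for local distillation.
    """
    probs = [softmax(logits) for logits in client_logits]
    return np.mean(probs, axis=0)  # server-side averaging of soft labels

# Hypothetical usage: 3 clients, 5 proxy examples, 4 classes.
rng = np.random.default_rng(0)
client_logits = [rng.normal(size=(5, 4)) for _ in range(3)]
soft_targets = fd_round(client_logits)
print(soft_targets.shape)  # (5, 4) -- distillation targets; no weights exchanged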
Paper Type: Long
Research Area: Low-resource Methods for NLP
Research Area Keywords: distillation, data-efficient training, NLP in resource-constrained settings
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: English
Submission Number: 5920