Keywords: One-Shot Federated Learning, data-free aggregation, Gaussian Discriminant Heads, Knowledge Distillation
TL;DR: We present GH-OFL, a one-shot federated learning family in which clients send only per-class counts and moments; the server builds Gaussian heads that achieve state-of-the-art data-free accuracy under strong non-IID skew.
Abstract: Classical Federated Learning relies on a multi-round iterative process of model exchange and aggregation between server and clients, incurring high communication costs and privacy risks from repeated model transmissions. In contrast, one-shot federated learning (OFL) alleviates these limitations by reducing communication to a single round, thereby lowering overhead and enhancing practical deployability. Nevertheless, most existing one-shot approaches remain impractical or constrained: for example, they often depend on the availability of a public dataset, assume homogeneous client models, or require uploading additional data or model information. To overcome these issues, we introduce the Gaussian-Head OFL (GH-OFL) family, a suite of one-shot federated methods that assume class-conditional Gaussianity of pretrained embeddings. Clients transmit only sufficient statistics (per-class counts and first/second-order moments), and the server builds heads via three components: (i) closed-form Gaussian heads (NB/LDA/QDA) computed directly from the received statistics; (ii) FisherMix, a linear head with a cosine margin trained on synthetic samples drawn in an estimated Fisher subspace; and (iii) Proto-Hyper, a lightweight low-rank residual head that refines the Gaussian logits via knowledge distillation on those synthetic samples. In our experiments, GH-OFL methods deliver state-of-the-art robustness and accuracy under strong non-IID skew while remaining strictly data-free.
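To make the data-free aggregation concrete, here is a minimal sketch (not from the submission; the function names `client_statistics` and `server_lda_head` are hypothetical) of how per-class counts and first/second-order moments could be combined on the server into a closed-form LDA head, assuming client embeddings are available as numpy arrays:

```python
import numpy as np

def client_statistics(embeddings, labels, num_classes):
    """Per-class counts and first/second-order moments (all a client uploads)."""
    d = embeddings.shape[1]
    counts = np.zeros(num_classes)
    first = np.zeros((num_classes, d))
    second = np.zeros((num_classes, d, d))
    for c in range(num_classes):
        X = embeddings[labels == c]
        counts[c] = len(X)
        if len(X):
            first[c] = X.sum(axis=0)       # sum of embeddings
            second[c] = X.T @ X            # sum of outer products
    return counts, first, second

def server_lda_head(stats, eps=1e-3):
    """Aggregate client statistics and build a closed-form LDA head."""
    counts = sum(s[0] for s in stats)
    first = sum(s[1] for s in stats)
    second = sum(s[2] for s in stats)
    n_total = counts.sum()
    mu = first / np.maximum(counts, 1)[:, None]   # class means
    # Pooled within-class covariance from raw moments, plus a ridge term
    # eps * I for numerical stability (a common choice, assumed here).
    scatter = second.sum(axis=0) - (counts[:, None] * mu).T @ mu
    sigma = scatter / max(n_total - len(counts), 1) + eps * np.eye(mu.shape[1])
    prec = np.linalg.inv(sigma)
    W = mu @ prec                                 # per-class weight vectors
    b = -0.5 * np.einsum('cd,cd->c', W, mu) + np.log(np.maximum(counts, 1) / n_total)
    return W, b  # logits(x) = x @ W.T + b
```

Under the same fitted class-conditional Gaussians, synthetic features for FisherMix and Proto-Hyper could be drawn with `np.random.multivariate_normal(mu[c], sigma)`; the paper's specific Fisher-subspace sampling, cosine-margin training, and distillation procedure are not reproduced in this sketch.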
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 19172