Accuracy at Lower Cost: Rethinking Client Selection in Federated Learning

ICLR 2026 Conference Submission24846 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Federated Learning, Client Selection Problem

TL;DR: Optimal Client Selection for Federated Learning by balancing Accuracy and Communication Cost

Abstract: Federated learning (FL) enables collaborative model training across multiple clients without sharing raw data, thereby ensuring privacy. A critical performance factor for FL is client selection. Under independent and identically distributed (IID) data, clients are chosen at random, which can lead to reduced accuracy, slower convergence, and higher communication cost. In this work, we present a systematic empirical study of client selection, revealing that random participation can significantly degrade performance. Motivated by these findings, we introduce a multi-objective optimization strategy that jointly balances model accuracy and communication cost under IID partitioning. For fast evaluation, we propose a dataset complexity-aware surrogate regressor that predicts the FL outcomes (e.g., accuracy or loss) for image classification tasks, thereby avoiding costly full model training. Using the predicted client configuration (number of selected and available clients) resulting from multi-objective optimization on a new dataset, and without requiring any additional training, our framework achieves 98.9\% of the maximum attainable accuracy while incurring only 38.75\% of the maximum communication cost. Moreover, it identifies a diminishing-returns regime that preserves 99.9\% of peak accuracy while reducing cost to 63.12\% of the maximum. These results demonstrate that both the performance and variance of FL can be estimated solely from dataset complexity and client dataset size, enabling the identification of client configurations that best balance accuracy and communication cost.

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Submission Number: 24846
