FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data

Wenda Chu; Chulin Xie; Boxin Wang; Linyi Li; Lang Yin; Arash Nourian; Han Zhao; Bo Li

Published: 28 Oct 2023, Last Modified: 14 Dec 2023

Venue: FL@FM-NeurIPS’23 Oral

Student Author Indication: Yes

Keywords: federated learning, fairness, data heterogeneity, clustering, expectation-maximization (EM)

TL;DR: We propose a formal definition of fairness via agent-awareness for FL (FAA) on heterogeneous data and a fair FL training algorithm based on agent clustering (FOCUS) to achieve FAA.

Abstract: Federated learning (FL) allows agents to jointly train a global model without sharing their local data, which protects the privacy of local agents. However, because local data are heterogeneous, existing definitions of fairness in FL are vulnerable to noisy agents in the network. For instance, existing work usually adopts accuracy parity as the fairness metric across agents. This metric is not robust in the heterogeneous setting: it forces agents with high-quality data to achieve accuracy similar to that of agents who contribute low-quality data, which may discourage agents with high-quality data from participating in FL. In this work, we propose a formal FL fairness definition, fairness via agent-awareness (FAA), which accounts for the heterogeneity of different agents by measuring data quality with an approximated Bayes optimal error. Under FAA, the performance of agents with high-quality data is not sacrificed merely because many agents with low-quality data are present. In addition, we propose a fair FL training algorithm that leverages agent clustering (FOCUS) to achieve fairness in FL, as measured by FAA and other fairness metrics. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for both linear and general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness in terms of FAA than standard FedAvg under both linear and general convex loss functions. Empirically, we show that on four FL datasets, including synthetic data, images, and texts, FOCUS achieves significantly higher fairness in terms of FAA and other fairness metrics while maintaining competitive prediction accuracy compared with FedAvg and four state-of-the-art fair FL algorithms.
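Illustrative sketch (not taken from the paper): the contrast between accuracy parity and an agent-aware metric can be made concrete with a toy computation. The abstract does not state the FAA formula, so the code below assumes one plausible reading: each agent's loss is compared with an approximation of its own best achievable (Bayes optimal) loss, and unfairness is the largest gap in that excess loss between any two agents. The function names and the example numbers are hypothetical.

```python
from itertools import combinations
from statistics import pstdev


def accuracy_parity_gap(accuracies):
    """Spread of per-agent accuracy (population standard deviation).

    A value of 0 means every agent reaches the same accuracy.
    """
    return pstdev(accuracies)


def excess_losses(losses, approx_bayes_losses):
    """Per-agent loss minus that agent's approximated Bayes optimal loss.

    ASSUMPTION: this excess-loss reading of "agent-awareness" is an
    illustration; the abstract does not give the exact definition.
    """
    return [loss - best for loss, best in zip(losses, approx_bayes_losses)]


def agent_aware_gap(losses, approx_bayes_losses):
    """Largest pairwise difference in excess loss across agents."""
    excess = excess_losses(losses, approx_bayes_losses)
    return max((abs(a - b) for a, b in combinations(excess, 2)), default=0.0)


# Hypothetical numbers: two agents with clean data, one with noisy data.
losses = [0.10, 0.12, 0.60]
approx_bayes_losses = [0.05, 0.07, 0.55]
gap = agent_aware_gap(losses, approx_bayes_losses)
```

In this toy setting the raw losses differ widely, yet every agent sits about equally far from its own best achievable loss, so the agent-aware gap is close to zero. A parity-style metric applied to the raw losses would flag the same model as unfair.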

Submission Number: 21
