Can Fair Federated Learning reduce the need for personalization?

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
Submitted to ICLR 2023
Readers: Everyone
Keywords: Federated Learning, Fair Federated Learning, FL, Fair FL, Local Adaptation, Personalization, Machine Learning, ML, Deep Learning, DL, Distributed Machine Learning
TL;DR: This work evaluates Q-Fair Federated Learning as an alternative to personalization; we find that it does not satisfactorily improve local federated model performance, and we propose an approach based on knowledge distillation that offers favourable results.
Abstract: Federated Learning (FL) allows edge devices to collaboratively train machine learning models without sharing local data. Since the data distribution varies across client partitions, the performance of the federated model on local data also varies. To address this, fair FL approaches attempt to reduce the accuracy disparity between local partitions by emphasizing clients with larger losses during training, while local adaptation personalizes the federated model by re-training it on local data to provide a device participation incentive in cases where the federated model underperforms one trained locally, i.e., their accuracy difference is less than zero. This paper evaluates Q-Fair Federated Learning (Q-FFL) in this relative domain and determines whether it provides a better starting point for personalization or supplants it. Contrary to expectation, Q-FFL does not significantly reduce the number of underperforming clients in a language task and doubles their number in an image recognition task. Furthermore, fairness levels that maintain average accuracy provide no benefit to relative accuracy in either federated or adapted models. We postulate that Q-FFL is unsuitable for our goal because clients with highly accurate local models require the federated model to achieve disproportionately high accuracy on their local partitions before they receive a benefit. Instead, we propose using knowledge distillation during FL training to create models with a higher local accuracy floor without forfeiting the ceiling. Our preliminary evaluation shows a 50% reduction in underperforming clients in the language task with no increase in underperforming clients for the image task. We therefore argue that this simple change is a more promising avenue for reducing the need for personalization than fairness.
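For context, the fairness mechanism evaluated here is the q-FFL objective of Li et al. (2020), which reweights each client's contribution by raising its local loss to the power q+1 so that higher-loss clients dominate the aggregate objective. Below is a sketch of that objective together with an illustrative distillation-regularized local loss; the abstract does not specify the paper's exact distillation formulation, so the second equation, including the weight \lambda and temperature T, is an assumed generic form rather than the authors' method:

\min_{w} \; f_q(w) = \sum_{k=1}^{m} \frac{p_k}{q+1} F_k^{q+1}(w),
\qquad
\mathcal{L}_k(w) = \mathcal{L}_{\mathrm{CE}}(w; \mathcal{D}_k) + \lambda\, T^2 \, \mathrm{KL}\!\big(\sigma(z_{\mathrm{global}}/T) \,\|\, \sigma(z_w/T)\big),

where F_k is client k's local loss, p_k its aggregation weight, \sigma the softmax, and z_{global} and z_w the logits of the global (teacher) and local (student) models. Setting q = 0 recovers the standard FedAvg objective, while larger q trades average accuracy for a tighter accuracy spread across clients.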
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Optimization (eg, convex and non-convex optimization)