Hybrid Federated Learning for Feature & Sample Heterogeneity: Algorithms and Implementation

Published: 04 May 2024, Last Modified: 16 May 2024. Accepted by TMLR.
Abstract: Federated learning (FL) is a popular distributed machine learning paradigm for learning from distributed and private data sets. Based on the data partition pattern, FL is often categorized into horizontal, vertical, and hybrid settings. All three settings have many applications, but hybrid FL remains comparatively less explored, because it deals with the challenging situation where both the feature space and the data samples are heterogeneous. This work designs a novel mathematical model that allows the clients to aggregate distributed data with heterogeneous, and possibly overlapping, features and samples. Our main idea is to partition each client's model into a feature extractor part and a classifier part, where the former processes the input data, while the latter learns from the extracted features. The heterogeneous feature aggregation is done by building a server model, which assimilates local classifiers and feature extractors through a carefully designed matching mechanism. A communication-efficient algorithm is then designed to train both the client and server models. Finally, we conduct numerical experiments on multiple image classification data sets to validate the performance of the proposed algorithm. To our knowledge, this is the first formulation and algorithm developed for hybrid FL.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jeff_Phillips1
Submission Number: 1911
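
The three partition patterns named in the abstract can be told apart by comparing which samples and which features each client holds. The following is a minimal sketch of that distinction only, not the paper's model or algorithm; the function name `partition_pattern`, the client labels, and the sample and feature names are all hypothetical and chosen for illustration.

```python
from typing import Dict, Set, Tuple

# Each client is described by (sample_ids, feature_names).
# Both sets hold hypothetical labels used only for this illustration.
Client = Tuple[Set[str], Set[str]]


def partition_pattern(clients: Dict[str, Client]) -> str:
    """Classify a data layout as 'horizontal', 'vertical', 'hybrid', or 'identical'."""
    if not clients:
        raise ValueError("at least one client is required")

    sample_sets = {frozenset(samples) for samples, _ in clients.values()}
    feature_sets = {frozenset(features) for _, features in clients.values()}

    same_samples = len(sample_sets) == 1    # every client holds the same samples
    same_features = len(feature_sets) == 1  # every client holds the same features

    if same_features and not same_samples:
        return "horizontal"  # shared feature space, different samples
    if same_samples and not same_features:
        return "vertical"    # shared samples, different feature spaces
    if not same_samples and not same_features:
        return "hybrid"      # both samples and features differ (may overlap)
    return "identical"       # all clients hold exactly the same data layout


# Hypothetical example: two clients with overlapping samples and features.
example = {
    "client_a": ({"s1", "s2"}, {"age", "income"}),
    "client_b": ({"s2", "s3"}, {"income", "zip"}),
}
print(partition_pattern(example))
```

In this example the clients share sample `s2` and feature `income` but agree on neither the full sample set nor the full feature set, which is the overlapping, heterogeneous case the abstract calls hybrid.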