Federated Learning of Large Models at the Edge via Principal Sub-Model Training

Yue Niu; Saurav Prakash; Souvik Kundu; Sunwoo Lee; Salman Avestimehr

Federated Learning of Large Models at the Edge via Principal Sub-Model Training

Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr

Published: 21 Oct 2022, Last Modified: 05 May 2023FL-NeurIPS 2022 PosterReaders: Everyone

Keywords: Federated leaning, Resource-constrained clients, Sub-model training

Abstract: Limited compute and communication capabilities of edge users create a significant bottleneck for federated learning (FL) of large models. We consider a realistic, but much less explored, cross-device FL setting in which no client has the capacity to train a full large model nor is willing to share any intermediate activations with the server. To this end, we present Principal Sub-Model (PriSM) training methodology, which leverages models’ low-rank structure and kernel orthogonality to train sub-models in the orthogonal kernel space. More specifically, by applying singular value decomposition (SVD) to original kernels in the server model, PriSM first obtains a set of principal orthogonal kernels in which each one is weighed by its singular value. Thereafter, PriSM utilizes a novel sampling strategy that selects different subsets of the principal kernels independently to create sub-models for clients. Importantly, a kernel with a large singular value is assigned with a high sampling probability. Thus, each sub-model is a low-rank approximation of the full large model, and all clients together achieve the near full-model training. Our extensive evaluations on multiple datasets in resource-constrained settings show that PriSM can yield an improved performance of up to $10\%$ compared to existing alternatives, with only around $20\%$ sub-model training.

Is Student: Yes

4 Replies

Loading