On Sampling Strategies for Spectral Model Sharding

Published: 25 Sept 2024, Last Modified: 06 Nov 2024
Venue: NeurIPS 2024 poster
License: CC BY-NC-ND 4.0
Keywords: federated learning, singular value decomposition, heterogeneous devices
TL;DR: We propose new sampling strategies and techniques for local training of low-rank factorized neural networks for the scenario of federated learning on heterogeneous devices.
Abstract: The problem of heterogeneous clients in federated learning has recently drawn considerable attention. Spectral model sharding, i.e., partitioning the model parameters into low-rank matrices based on the singular value decomposition, has been one of the proposed solutions for more efficient on-device training in such settings. In this work we present two sampling strategies for such sharding, obtained as solutions to specific optimization problems. The first produces unbiased estimators of the original weights, while the second aims to minimize the squared approximation error. We discuss how both of these estimators can be incorporated into the federated learning loop, along with practical considerations that arise during local training. Empirically, we demonstrate that both of these methods can lead to improved performance on various commonly used datasets.
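To make the notion of an unbiased spectral estimator concrete, the following is a minimal sketch and not the paper's actual strategy. It assumes the weight matrix is already factorized as W = sum_i s_i u_i v_i^T, that component i is kept independently with probability p_i, and that a kept component is rescaled by 1/p_i, so each coefficient has expectation 1 and the sharded matrix equals W in expectation. The choice of p_i proportional to s_i (capped at 1), and the names `inclusion_probabilities`, `sample_coefficients`, and `budget`, are illustrative assumptions.

```python
import random


def inclusion_probabilities(singular_values, budget):
    """Return p_i proportional to s_i, capped at 1, with sum(p) == budget.

    Assumption: singular values are strictly positive and
    0 < budget <= len(singular_values). Hypothetical helper.
    """
    n = len(singular_values)
    if not 0 < budget <= n:
        raise ValueError("budget must lie in (0, number of components]")
    probs = [0.0] * n
    free = list(range(n))      # components whose probability is not yet fixed
    remaining = float(budget)  # expected rank still to be distributed
    while free:
        total = sum(singular_values[i] for i in free)
        over = [i for i in free if remaining * singular_values[i] / total > 1.0]
        if not over:
            for i in free:
                probs[i] = remaining * singular_values[i] / total
            break
        # Components that would exceed probability 1 are always kept.
        for i in over:
            probs[i] = 1.0
        remaining -= len(over)
        free = [i for i in free if i not in over]
    return probs


def sample_coefficients(probs, rng):
    """Draw c_i = 1/p_i with probability p_i, else 0, so that E[c_i] = 1.

    The sharded weights would be sum_i c_i * s_i * u_i v_i^T; only components
    with c_i > 0 need to be sent to the client.
    """
    return [1.0 / p if rng.random() < p else 0.0 for p in probs]


if __name__ == "__main__":
    s = [10.0, 1.0, 1.0]
    p = inclusion_probabilities(s, budget=2)
    print("inclusion probabilities:", p)
    print("one draw of coefficients:", sample_coefficients(p, random.Random(0)))
```

With independent draws the number of kept components matches `budget` only in expectation; a fixed-size sampling scheme would be needed to guarantee an exact rank per client. The squared-error-oriented strategy mentioned in the abstract trades this unbiasedness for lower approximation error and is not covered by this sketch.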
Primary Area: Deep learning architectures
Submission Number: 9833