Keywords: federated learning, efficient training, dropout, stochastic model, channel selection
Abstract: Federated learning (FL) enables edge clients to train collaboratively while preserving individuals' data privacy. Because clients do not share identical data distributions, they may disagree on the direction of parameter updates, resulting in high compute and communication costs compared to centralized learning. Recent advances in FL focus on reducing data transmission during training, yet they neglect the increase in computational cost, which dwarfs the benefit of reduced communication. To this end, we propose FedDrop, which introduces channel-wise weighted dropout layers between convolutions to accelerate training while minimizing their impact on convergence. Empirical results show that FedDrop drastically reduces the number of FLOPs required for training at a small increase in communication, and pushes the Pareto frontier of the communication/computation trade-off further than competing FL algorithms.
One-sentence Summary: FedDrop adjusts channel dropout probabilities to concentrate each client's training effort on the neurons in which it specializes, while sparsifying models for an improved communication/compute trade-off.
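The page does not include code, so the following is only a minimal PyTorch sketch of what a channel-wise weighted dropout layer of the kind described in the abstract could look like. The class name `WeightedChannelDropout`, the buffer `keep_prob`, and the fixed initial keep probability are all hypothetical; in FedDrop the per-channel probabilities would be adapted per client rather than held constant.

```python
import torch
import torch.nn as nn


class WeightedChannelDropout(nn.Module):
    """Sketch of channel-wise dropout with per-channel keep probabilities.

    Each of the `num_channels` feature maps is kept with its own probability,
    and surviving channels are rescaled so the expected activation is
    unchanged (inverted dropout). Intended to sit between convolution layers.
    """

    def __init__(self, num_channels: int, init_keep_prob: float = 0.8):
        super().__init__()
        # Per-channel keep probabilities; fixed here for illustration only.
        self.register_buffer(
            "keep_prob", torch.full((num_channels,), init_keep_prob)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x is expected to have shape (N, C, H, W).
        if not self.training:
            return x
        # Sample one Bernoulli mask entry per (sample, channel) pair and
        # broadcast it over the spatial dimensions.
        p = self.keep_prob.view(1, -1, 1, 1)
        mask = torch.bernoulli(p.expand(x.size(0), -1, 1, 1))
        return x * mask / p.clamp(min=1e-8)
```

A layer like this could be inserted after each convolution block, e.g. `nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), WeightedChannelDropout(64))`, so that channels with low keep probability are skipped during local training on a client.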