FedBCGD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
Abstract: Although federated learning has been widely studied in recent years, each communication round still incurs high overhead for large-scale models such as Vision Transformer. To lower the communication complexity, we propose a novel communication-efficient block coordinate gradient descent (FedBCGD) method. The proposed method splits the model parameters into several blocks and lets each client upload only a specific parameter block during training, which significantly reduces communication overhead. Moreover, we develop an accelerated FedBCGD algorithm (called FedBCGD+) with client drift control and stochastic variance reduction techniques. To the best of our knowledge, this is the first work on parameter block communication for training large-scale deep models. We also provide convergence analysis for the proposed algorithms. Our theoretical results show that the communication complexities of our algorithms are a factor of $1/N$ lower than those of existing methods, where $N$ is the number of parameter blocks, and that they enjoy much faster convergence than their counterparts. Empirical results demonstrate the superiority of the proposed algorithms over state-of-the-art methods.
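To make the block-wise communication idea concrete, here is a minimal NumPy sketch of one round in which each client trains locally but uploads only its assigned parameter block, so per-client upload cost drops by a factor of the number of blocks. This is an illustrative toy under assumed names (split_into_blocks, local_sgd, fedbcgd_round, cyclic block assignment, a linear-regression local loss), not the authors' reference implementation; in particular, the drift control and variance reduction of FedBCGD+ are not shown.

```python
# Toy sketch of block-wise upload (assumed helper names, not the paper's code).
import numpy as np

def split_into_blocks(params: np.ndarray, num_blocks: int) -> list:
    """Partition the flat parameter vector into num_blocks contiguous blocks."""
    return np.array_split(params, num_blocks)

def local_sgd(params: np.ndarray, data, lr: float = 0.01, steps: int = 5) -> np.ndarray:
    """Placeholder local update: a few SGD steps on a least-squares loss."""
    x, y = data
    for _ in range(steps):
        grad = x.T @ (x @ params - y) / len(y)  # gradient of 0.5*||Xw - y||^2 / n
        params = params - lr * grad
    return params

def fedbcgd_round(global_params, clients_data, num_blocks):
    """One communication round: each client uploads only its assigned block."""
    blocks = split_into_blocks(global_params, num_blocks)
    uploaded = {b: [] for b in range(num_blocks)}
    for cid, data in enumerate(clients_data):
        b = cid % num_blocks                     # cyclic block assignment (assumed)
        local = local_sgd(global_params.copy(), data)
        local_blocks = split_into_blocks(local, num_blocks)
        uploaded[b].append(local_blocks[b])      # upload only block b: 1/N of the model
    # Server averages the received copies of each block; untouched blocks are kept.
    new_blocks = [np.mean(uploaded[b], axis=0) if uploaded[b] else blocks[b]
                  for b in range(num_blocks)]
    return np.concatenate(new_blocks)

# Usage: 8 clients, 4 parameter blocks, synthetic linear-regression clients.
rng = np.random.default_rng(0)
w = rng.normal(size=20)
clients = [(rng.normal(size=(32, 20)), rng.normal(size=32)) for _ in range(8)]
for _ in range(10):
    w = fedbcgd_round(w, clients, num_blocks=4)
```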
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Vision and Language
Relevance To Conference: Our paper presents an innovative approach to federated learning, specifically focusing on the application of block coordinate gradient descent in distributed settings. This novel algorithm, termed "Federated Block Coordinate Gradient Descent (FedBCGD)," addresses crucial challenges in communication efficiency and convergence speed, which are highly pertinent to the multimedia domain.
Here's how our paper aligns with the conference's scope and themes:
Multimedia Data Processing: FedBCGD offers a promising solution for efficient and scalable processing of multimedia data distributed across multiple devices. By optimizing communication overhead and convergence speed, our algorithm enhances the feasibility of multimedia applications involving large-scale data sets.
Distributed Learning: The conference emphasizes advancements in distributed learning techniques for multimedia analysis. Our paper contributes to this theme by introducing a novel federated learning algorithm tailored for multimedia data.
Algorithmic Innovations: MM seeks contributions that introduce innovative algorithms and methodologies for multimedia analysis. Our paper presents a novel approach that partitions model parameters into blocks and updates them independently across clients, leading to improved communication efficiency and convergence speed.
Overall, our paper offers a significant contribution to the multimedia research community by addressing fundamental challenges in federated learning and providing practical solutions for efficient multimedia data processing and analysis.
Supplementary Material: zip
Submission Number: 2536