Log-Scale Quantization in Distributed First-Order Methods: Gradient-Based Learning From Distributed Data
Abstract: Decentralized strategies are of interest for learning from large-scale data over networks. This paper studies learning over a network of geographically distributed nodes/agents subject to quantization. Each node possesses a private local cost function; together these contribute to a global cost function that the considered methodology aims to minimize. In contrast to many existing works, the information exchanged among nodes is log-quantized to address the limited network bandwidth of practical settings. We consider a computationally efficient first-order distributed optimization algorithm (with no extra inner consensus loop) that leverages node-level gradient correction based on local data and network-level gradient aggregation only over nearby nodes. The method requires only weight-balanced networks, with no need for stochastic weight design, and can handle log-scale quantized data exchange over possibly time-varying and switching network topologies. We study convergence over both structured networks (for example, training over data centers) and ad-hoc multi-agent networks (for example, training over dynamic robotic networks). Through experimental validation, we show that (i) structured networks generally result in a smaller optimality gap, and (ii) log-scale quantization leads to a smaller optimality gap than uniform quantization.

Note to Practitioners—Motivated by recent developments in cloud computing, parallel processing, and the availability of low-cost CPUs and communication networks, this paper considers distributed and decentralized algorithms for machine learning and optimization. These algorithms are particularly relevant for decentralized data mining, where data sets are distributed across a network of computing nodes; a practical example is the classification of images over a networked data center. In real-world scenarios, practical model nonlinearities such as data quantization must be addressed in the information exchange among computing nodes. This work emphasizes the importance of handling log-scale quantization and compares its performance against uniform quantization, aiming to determine which is more accurate in terms of optimality gap and learning residual. Moreover, we study the impact of the structure of the information-sharing network on reducing the optimality gap and improving the convergence rate of distributed algorithms. As contemporary distributed and networked data-mining systems demand highly accurate algorithms with fast convergence for real-time applications, our research highlights the benefit of structured networks under log-quantized information exchange. Our findings extend to different machine learning algorithms, offering pathways to more accurate and faster data-mining solutions.
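To make the two ingredients of the abstract concrete, the following is a minimal Python sketch of (i) a logarithmic quantizer in one common form, q(z) = sgn(z)·exp(ρ·round(log|z|/ρ)), and (ii) a gradient-tracking-style distributed update in which nodes exchange only quantized states over a balanced network. Everything here is illustrative, not the paper's exact algorithm: the quantization level rho, the step sizes alpha and beta, the ring topology, and the toy quadratic costs f_i(x) = ½(x − b_i)² are all assumed for the example.

```python
import numpy as np

def log_quantize(z, rho=0.125):
    """Logarithmic quantizer (assumed form): fine resolution near zero,
    coarse far away: q(z) = sgn(z) * exp(rho * round(log|z| / rho))."""
    out = np.zeros_like(z)
    nz = z != 0  # zero maps to zero
    out[nz] = np.sign(z[nz]) * np.exp(rho * np.round(np.log(np.abs(z[nz])) / rho))
    return out

# Toy problem (hypothetical data): node i holds f_i(x) = 0.5*(x - b_i)^2,
# so the global minimizer of sum_i f_i is mean(b).
rng = np.random.default_rng(0)
n = 8
b = rng.normal(size=n)

# Undirected ring network: symmetric, hence weight-balanced by construction.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0
deg = W.sum(axis=1)  # node degrees, used for the Laplacian term L q = deg*q - W@q

x = np.zeros(n)        # primal states, one scalar per node
grad = x - b           # local gradients of the quadratic costs
y = grad.copy()        # auxiliary states tracking the network-average gradient
alpha, beta = 0.05, 0.2  # gradient / consensus step sizes (assumed values)

for k in range(300):
    qx, qy = log_quantize(x), log_quantize(y)
    # Consensus over log-quantized neighbor states plus a gradient step.
    x_new = x - beta * (deg * qx - W @ qx) - alpha * y
    grad_new = x_new - b
    # Node-level gradient correction: track the average gradient via y.
    y = y - beta * (deg * qy - W @ qy) + (grad_new - grad)
    x, grad = x_new, grad_new

print("global minimizer:", b.mean())
print("node states:", np.round(x, 3))
```

Because the exchanged values are quantized, the states converge to a neighborhood of the global minimizer rather than to it exactly; the size of that neighborhood (the optimality gap) is what the paper compares between log-scale and uniform quantization.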