Abstract: Gradient coding is a method for mitigating straggling servers in a centralized computing network; it uses erasure-coding techniques to distributively carry out first-order optimization methods. Randomized numerical linear algebra leverages randomization to develop improved algorithms for large-scale linear algebra computations. In this paper, we propose a method for distributed optimization that combines gradient coding and randomized numerical linear algebra. The proposed method uses a randomized $\ell_2$-subspace embedding and a gradient coding technique to distribute blocks of data to the computational nodes of a centralized network, and at each iteration the central server requires only a small number of computations to obtain the steepest descent update. The novelty of our approach is that the data is replicated according to importance scores, called block leverage scores, in contrast to most gradient coding approaches, which replicate the data blocks uniformly. Furthermore, we do not require a decoding step at each iteration, avoiding a bottleneck of previous gradient coding schemes. We show that our approach results in a valid $\ell_2$-subspace embedding, and that our resulting approximation converges to the optimal solution.
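To make the core idea concrete, the following is a minimal sketch (not the authors' implementation) of block leverage score sampling driving an iterative sketched least-squares solve: blocks are sampled in proportion to their leverage scores, and each iteration forms an unbiased steepest descent update from only the sampled blocks. The objective $\min_x \|Ax - b\|_2^2$, the exact SVD-based leverage scores, and the block size, sketch size, and step size below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, tau = 1200, 20, 40                      # rows, columns, block size (assumed)
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
blocks = [slice(i, i + tau) for i in range(0, n, tau)]

# Block leverage scores: sums of row leverage scores within each block,
# computed here exactly from an orthonormal basis U of A's column space.
U, _, _ = np.linalg.svd(A, full_matrices=False)
row_lev = np.sum(U**2, axis=1)                # row leverage scores
block_lev = np.array([row_lev[s].sum() for s in blocks])
probs = block_lev / block_lev.sum()           # block sampling distribution

def sketched_gradient(x, q=10):
    """Unbiased estimate of grad ||Ax - b||^2 from q sampled blocks."""
    idx = rng.choice(len(blocks), size=q, p=probs)
    g = np.zeros(d)
    for i in idx:
        s = blocks[i]
        # Rescale each sampled block by 1/(q * p_i) so the estimate is unbiased.
        g += (A[s].T @ (A[s] @ x - b[s])) / (q * probs[i])
    return 2.0 * g

# Steepest descent with a conservative step size below 1/sigma_max(A)^2.
eta = 0.5 / np.linalg.norm(A, ord=2) ** 2
x = np.zeros(d)
for _ in range(500):
    x -= eta * sketched_gradient(x)

print(np.linalg.norm(A @ x - b))              # approaches the least-squares residual
```

In a distributed deployment as described above, each computational node would hold replicated blocks (with replication proportional to the block leverage scores) and return its partial gradient, so the central server only aggregates the rescaled partial sums; no decoding step is needed.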