Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters

Huan Li, Cong Fang, Wotao Yin, Zhouchen Lin

IEEE Trans. Signal Process., 2020 (modified: 04 Oct 2022)

Abstract: In this article, we study the communication and (sub)gradient computation costs in distributed optimization. We present two algorithms based on the framework of the accelerated penalty method with increasing penalty parameters. Our first algorithm is for smooth distributed optimization, and it obtains the near optimal $O\left(\sqrt{\frac{L}{\epsilon(1-\sigma_2(W))}}\log\frac{1}{\epsilon}\right)$ communication complexity and the optimal $O\left(\sqrt{\frac{L}{\epsilon}}\right)$ gradient computation complexity for $L$-smooth convex problems, where $\sigma_2(W)$ denotes the second largest singular value of the weight matrix $W$ associated to the network, and $\epsilon$ is the target accuracy. When the problem is $\mu$-strongly convex and $L$-smooth, our algorithm has the near optimal $O\left(\sqrt{\frac{L}{\mu(1-\sigma_2(W))}}\log^2\frac{1}{\epsilon}\right)$ complexity for communications and the optimal $O\left(\sqrt{\frac{L}{\mu}}\log^2\frac{1}{\epsilon}\right)$ complexity for gradient computations. Our communication complexities are only worse by a factor of $\log\frac{1}{\epsilon}$ than the lower bounds. Our second algorithm is designed for nonsmooth distributed optimization, and it achieves both the optimal $O\left(\frac{1}{\epsilon\sqrt{1-\sigma_2(W)}}\right)$ communication complexity and the optimal $O\left(\frac{1}{\epsilon^2}\right)$ subgradient computation complexity, which match the lower bounds for nonsmooth distributed optimization.
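
To give a feel for how the network enters the bounds quoted in the abstract, the following minimal sketch computes the second largest singular value of a weight matrix and evaluates the stated orders with all hidden constants dropped. It is an illustration only, not the authors' algorithm: the ring topology, the uniform 1/3 weights, and every function name here are assumptions made for the example.

```python
import numpy as np

def ring_weight_matrix(n):
    """Hypothetical weight matrix for a ring of n >= 3 nodes:
    weight 1/3 on each node itself and on its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def second_largest_singular_value(W):
    """sigma_2(W): numpy returns singular values in descending order."""
    s = np.linalg.svd(W, compute_uv=False)
    return s[1]

def smooth_convex_orders(L, eps, sigma2):
    """Orders for the L-smooth convex case, constants omitted."""
    comm = np.sqrt(L / (eps * (1.0 - sigma2))) * np.log(1.0 / eps)
    grad = np.sqrt(L / eps)
    return comm, grad

def nonsmooth_orders(eps, sigma2):
    """Orders for the nonsmooth case, constants omitted."""
    comm = 1.0 / (eps * np.sqrt(1.0 - sigma2))
    subgrad = 1.0 / eps**2
    return comm, subgrad

if __name__ == "__main__":
    W = ring_weight_matrix(10)
    s2 = second_largest_singular_value(W)
    print("sigma_2(W) =", s2)
    print("smooth convex (comm, grad):", smooth_convex_orders(1.0, 1e-3, s2))
    print("nonsmooth (comm, subgrad):", nonsmooth_orders(1e-3, s2))
```

As the ring grows, $\sigma_2(W)$ approaches 1, so the communication orders grow while the (sub)gradient orders stay unchanged. The printed numbers are growth rates rather than iteration counts, since the constants inside the $O(\cdot)$ terms are not given in the abstract.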
