A Computation and Communication Efficient Projection-free Algorithm for Decentralized Constrained Optimization
Decentralized constrained optimization problems arise in numerous real-world applications, where a major challenge lies in the computational complexity of projecting onto complex sets, especially in large-scale systems. The projection-free method, Frank-Wolfe (FW), is popular for the constrained optimization problem with complex sets due to its efficiency in tackling the projection process. However, when applying FW methods to decentralized constrained finite-sum optimization problems, previous studies provide suboptimal incremental first-order oracle (IFO) bounds in both convex and non-convex settings. In this paper, we propose a stochastic algorithm named Decentralized Variance Reduction Gradient Tracking Frank-Wolfe ($\texttt{DVRGTFW}$), which incorporates the techniques of variance reduction, gradient tracking, and multi-consensus in the FW update to obtain tight bounds. We present a novel convergence analysis, diverging from previous decentralized FW methods, and demonstrating $\tilde{\mathcal{O}}(n+\sqrt{\frac{n}{m}}L\varepsilon^{-1})$ and $\mathcal{O}(\sqrt{\frac{n}{m}}L^2\varepsilon^{-2})$ IFO complexity bounds in convex and non-convex settings, respectively. To the best of our knowledge, these bounds are the best achieved in the literature to date. Besides, in the non-convex case, $\texttt{DVRGTFW}$ achieves $\mathcal{O}(\frac{L^2\varepsilon^{-2}}{\sqrt{1-\lambda_2(W)}})$ communication complexity which is closed to the lower bound $\Omega(\frac{L\varepsilon^{-2}}{\sqrt{1-\lambda_2(W)}})$. Empirical results validate the convergence properties of $\texttt{DVRGTFW}$ and highlight its superior performance over other related methods.