Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 · ICLR 2022 Submitted
Abstract: The Bregman proximal point algorithm (BPPA), one of the centerpieces of the optimization toolbox, has seen a growing range of applications. With a simple and easy-to-implement update rule, the algorithm comes with several compelling intuitions for its empirical success, yet rigorous justifications remain largely unexplored. We study the computational properties of BPPA through classification tasks with separable data, and demonstrate provable algorithmic regularization effects associated with BPPA. We show that BPPA attains a non-trivial margin, which depends closely on the condition number of the distance-generating function inducing the Bregman divergence. We further show that this dependence on the condition number is tight for a class of problems, highlighting the role of the divergence in determining the quality of the obtained solutions. In addition, we extend our findings to mirror descent (MD), for which we establish similar connections between the margin and the Bregman divergence. We illustrate our findings with a concrete example, showing that BPPA/MD converges in direction to the maximum-margin solution with respect to the squared Mahalanobis distance. Our theoretical findings are among the first to demonstrate the benign learning properties of BPPA/MD, and they strongly corroborate the importance of a careful choice of divergence in algorithmic design.
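To make the two update rules in the abstract concrete, below is a minimal, illustrative sketch (not code from the paper) of MD and BPPA in Python. It uses the quadratic distance-generating function psi(w) = 0.5 * w' Q w, whose Bregman divergence is the squared Mahalanobis distance mentioned in the abstract; the logistic loss, function names, step sizes, and inner-solver details are all assumptions made here for illustration.

```python
# Minimal sketch of mirror descent (MD) and the Bregman proximal point
# algorithm (BPPA) for binary classification on separable data, with the
# distance-generating function psi(w) = 0.5 * w @ Q @ w. Its Bregman
# divergence D_psi(w, u) = 0.5 * (w - u) @ Q @ (w - u) is the squared
# Mahalanobis distance. All hyperparameters are illustrative.
import numpy as np

def logistic_grad(w, X, y):
    """Gradient of (1/n) * sum_i log(1 + exp(-y_i * x_i @ w)),
    with X of shape (n, d) and labels y in {-1, +1}."""
    margins = y * (X @ w)
    return -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def mirror_descent(X, y, Q, eta=0.1, steps=5000):
    """Explicit MD step: w <- (grad psi)^{-1}(grad psi(w) - eta * grad L(w)).
    With quadratic psi this reduces to preconditioned gradient descent."""
    Q_inv = np.linalg.inv(Q)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - eta * (Q_inv @ logistic_grad(w, X, y))
    return w

def bppa(X, y, Q, eta=1.0, steps=200, inner_steps=50, alpha=0.1):
    """Implicit BPPA step: w_{t+1} = argmin_w L(w) + D_psi(w, w_t) / eta,
    solved approximately by a few inner preconditioned gradient steps."""
    Q_inv = np.linalg.inv(Q)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w_prev = w.copy()
        for _ in range(inner_steps):
            # Gradient of the proximal subproblem at the current iterate.
            g = logistic_grad(w, X, y) + (Q @ (w - w_prev)) / eta
            w = w - alpha * (Q_inv @ g)
    return w
```

On separable data the iterates grow without bound in norm, so the object of interest is the direction w / ||w||; the abstract's claims concern the margin attained by this limiting direction and its dependence on the condition number of psi.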