A Novel Convergence Analysis for the Stochastic Proximal Point Algorithm

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission · Readers: Everyone
Abstract: In this paper, we study the stochastic proximal point algorithm (SPPA) for general empirical risk minimization (ERM) problems as well as deep learning problems. We present an efficient implementation of SPPA with minor modifications for different problem definitions, and we observe that efficiently implemented SPPA converges faster and more stably than the celebrated stochastic gradient descent (SGD) algorithm and its many variants, for both convex and non-convex problems. Because the per-iteration update of SPPA is defined abstractly and has long been considered expensive, its convergence had not been well studied until recently. In this paper, we close this theoretical gap by proving its convergence for convex problems. Our proof technique differs from recent attempts and yields a surprising result: SPPA for convex problems may converge \emph{arbitrarily fast}, depending on how the step sizes are chosen. As a second contribution, we show that for some canonical ERM problems and deep learning problems, each iteration of SPPA can be computed efficiently, either in closed form or close to closed form via bisection, so the resulting per-iteration complexity is exactly the same as that of SGD. Real-data experiments showcase its effectiveness in terms of convergence compared to SGD and its variants.