Keywords: Optimization, Machine Learning
Abstract: We investigate a family of multi-step proximal point methods, the Backward Differentiation
Formulas (BDFs), which are inspired by implicit linear discretizations of gradient flow. Each update
of the resulting methods has a computational cost similar to that of the proximal point method. We
explore several optimization methods in which applying an approximate multi-step proximal point
method improves convergence behavior. We argue that this improvement results from the lower
truncation error in approximating gradient flow.
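
As a rough illustration of the idea in the abstract, the sketch below compares the ordinary proximal point method (implicit Euler applied to gradient flow) with a two-step variant built from the standard BDF2 formula. The abstract does not state which members of the BDF family are used or how the proximal step is approximated, so the choice of BDF2, the scalar quadratic test function f(x) = 0.5*a*x**2, and all function names here are assumptions made for illustration only.

```python
import math


def prox_quadratic(v, a, t):
    """Exact proximal operator of t*f at v, for f(x) = 0.5*a*x**2.

    Solves x = v - t*a*x, i.e. x = v / (1 + t*a).
    """
    return v / (1.0 + t * a)


def proximal_point(x0, a, h, steps):
    """Proximal point method: implicit Euler on x' = -f'(x) with step h."""
    x = x0
    for _ in range(steps):
        x = prox_quadratic(x, a, h)
    return x


def bdf2_proximal(x0, a, h, steps):
    """Two-step proximal method from the standard BDF2 formula (assumed variant).

    BDF2 on gradient flow reads
        x_{k+1} - (4/3) x_k + (1/3) x_{k-1} = -(2/3) h f'(x_{k+1}),
    which is one proximal step with parameter 2h/3 applied to the
    extrapolated point (4 x_k - x_{k-1}) / 3.
    """
    if steps == 0:
        return x0
    x_prev = x0
    x = prox_quadratic(x0, a, h)  # bootstrap with one ordinary proximal step
    for _ in range(steps - 1):
        v = (4.0 * x - x_prev) / 3.0
        x_prev, x = x, prox_quadratic(v, a, 2.0 * h / 3.0)
    return x


def gradient_flow(x0, a, t):
    """Exact gradient flow solution x(t) = x0 * exp(-a*t) for the quadratic."""
    return x0 * math.exp(-a * t)


if __name__ == "__main__":
    x0, a, h, steps = 1.0, 1.0, 0.1, 10
    exact = gradient_flow(x0, a, h * steps)
    print("proximal point error:", abs(proximal_point(x0, a, h, steps) - exact))
    print("BDF2 proximal error: ", abs(bdf2_proximal(x0, a, h, steps) - exact))
```

Each BDF2 iteration still performs a single proximal evaluation, which matches the abstract's remark about per-update cost, and on this toy problem the two-step iterate tracks the gradient flow trajectory more closely than the one-step iterate at the same step size.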
Submission Number: 79