Keywords: Markov Chain Monte Carlo, Nesterov Accelerated Gradient, accelerated sampling
Abstract: It is known (Shi et al., 2021) that Nesterov's Accelerated Gradient (NAG) for optimization starts to differ from its continuous-time limit (noiseless kinetic Langevin) when its stepsize becomes finite. This work explores the sampling counterpart of this phenomenon and proposes an accelerated-gradient-based MCMC method, built on the NAG variant for strongly convex functions (NAG-SC): we reformulate NAG-SC as a Hessian-Free High-Resolution ODE, promote its high-resolution coefficient to a hyperparameter, inject appropriate noise, and discretize the resulting diffusion process. The accelerated sampling enabled by the new hyperparameter is quantified and shown not to be a false acceleration created by time-rescaling. At the continuous-time level, additional acceleration over underdamped Langevin dynamics in $W_2$ distance is proved. At the discrete-algorithm level, a dedicated discretization is proposed to simulate the Hessian-Free High-Resolution SDE in a cost-efficient manner. For log-strongly-concave and smooth target measures, the proposed algorithm achieves $\tilde{\mathcal{O}}(\sqrt{d}/\epsilon)$ iteration complexity in $W_2$ distance, the same as underdamped Langevin dynamics, but with a reduced constant. Numerical experiments verify our theoretical results.
One-sentence Summary: Just as NAG for optimization differs from its continuous-time limit at finite stepsize, samplers based on kinetic Langevin dynamics can also be further accelerated.
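For intuition, below is a minimal Euler-Maruyama sketch of a kinetic-Langevin-type sampler with an extra gradient term and matching noise in the position equation, both weighted by a hyperparameter `alpha`; setting `alpha = 0` recovers standard underdamped Langevin. The specific SDE form, the parameter names, and this naive discretization are assumptions for illustration only, not the paper's dedicated cost-efficient scheme.

```python
import numpy as np

def accelerated_langevin_sketch(grad_f, x0, v0, n_steps, h, gamma=2.0, alpha=1.0, rng=None):
    """Naive Euler-Maruyama sketch of an underdamped-Langevin-type SDE with an
    assumed alpha-weighted gradient correction and noise in the position update.
    Illustrative only: not the paper's dedicated discretization."""
    rng = np.random.default_rng() if rng is None else rng
    x, v = np.array(x0, dtype=float), np.array(v0, dtype=float)
    d = x.shape[0]
    for _ in range(n_steps):
        g = grad_f(x)  # gradient of the potential f = -log(target density), at the old state
        # Position update: velocity drift plus assumed alpha-weighted gradient
        # term and matching noise (alpha = 0 reduces to kinetic Langevin).
        x = x + h * (v - alpha * g) + np.sqrt(2.0 * alpha * h) * rng.standard_normal(d)
        # Velocity update: friction, gradient force, and the usual Langevin noise.
        v = v - h * (gamma * v + g) + np.sqrt(2.0 * gamma * h) * rng.standard_normal(d)
    return x, v

# Usage example: approximately sample a standard Gaussian, f(x) = ||x||^2 / 2.
if __name__ == "__main__":
    grad_f = lambda x: x
    x, v = accelerated_langevin_sketch(grad_f, np.zeros(2), np.zeros(2), n_steps=5000, h=0.01)
    print(x)
```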
Supplementary Material: zip