- Keywords: Optimal transport, Information geometry, Nesterov accelerated gradient method
- TL;DR: We study accelerated gradient flows in probability space.
- Abstract: We present a systematic framework for Nesterov's accelerated gradient flows in probability spaces endowed with information metrics. We consider two metrics: the Fisher-Rao metric and the Wasserstein-$2$ metric. For the Wasserstein-$2$ metric, we prove convergence properties of the accelerated gradient flows and introduce their formulations in Gaussian families. Furthermore, we propose a practical discrete-time algorithm, implemented with particles, that uses an adaptive restart technique. We also formulate a novel bandwidth selection method, which learns the Wasserstein-$2$ gradient direction from Brownian-motion samples. Experimental results, including Bayesian inference, show the strength of the proposed method compared with the state of the art.
- Code: https://www.dropbox.com/sh/niy9imw8k2gda4l/AADxv_6rbhELbQ8-nhGm7br1a?dl=0