Abstract: The natural gradient is a powerful method for improving the transient dynamics of learning by taking into account the geometric structure of the parameter space. Many natural gradient methods have been developed with respect to the Kullback-Leibler (KL) divergence and its Fisher metric, but the natural gradient framework can essentially be extended to other divergences. In this study, we focus on score matching, an alternative to maximum likelihood learning for unnormalized statistical models, and introduce its Riemannian metric. Using the score matching metric, we derive an adaptive natural gradient algorithm that does not require the computationally demanding inversion of the metric. Experiments on a multi-layer neural network model demonstrate that the proposed method avoids the plateau phenomenon and converges faster than conventional stochastic gradient descent.
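Since only the abstract is available here, the sketch below is an assumption-laden illustration of the two standard ingredients it combines, not the paper's actual algorithm. Score matching (Hyvärinen, 2005) minimizes J(θ) = E_x[ ½‖∇_x log p(x; θ)‖² + Δ_x log p(x; θ) ], which involves only derivatives of log p with respect to x and therefore drops the intractable normalizing constant of an unnormalized model. A natural gradient step then preconditions the parameter gradient by the inverse of a Riemannian metric; one standard way to avoid explicit inversion is the rank-one recursive estimate of the inverse metric in the style of Amari et al.'s adaptive natural gradient, shown here. The paper's score-matching-specific metric is not reproduced; `G_inv`, `lr`, and `eps` are hypothetical names.

```python
import numpy as np

def adaptive_natural_gradient_step(theta, grad, G_inv, lr=0.01, eps=0.01):
    """One natural gradient step without explicit metric inversion.

    G_inv is a running estimate of the inverse metric G^{-1}, refreshed
    by a first-order (Sherman-Morrison-style) rank-one correction, as in
    Amari-style adaptive natural gradient. The paper's score matching
    metric would replace this generic outer-product estimate.
    """
    g = grad.reshape(-1, 1)                 # gradient as a column vector
    Gg = G_inv @ g                          # preconditioned gradient
    # Rank-one recursive update of the inverse metric estimate:
    # G^{-1} <- (1 + eps) G^{-1} - eps (G^{-1} g)(G^{-1} g)^T
    G_inv = (1.0 + eps) * G_inv - eps * (Gg @ Gg.T)
    # Natural gradient descent: step along the preconditioned gradient.
    theta = theta - lr * (G_inv @ g).ravel()
    return theta, G_inv

# Usage sketch on a toy quadratic loss f(theta) = 0.5 * ||theta||^2.
theta = np.random.randn(5)
G_inv = np.eye(theta.size)                  # start from the identity metric
for _ in range(100):
    grad = theta                            # gradient of the toy loss
    theta, G_inv = adaptive_natural_gradient_step(theta, grad, G_inv)
```

The recursive update costs O(n²) per step instead of the O(n³) of a direct inversion, which is the kind of saving the abstract's "adaptive" algorithm refers to.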
Conflicts: u-tokyo.ac.jp, riken.jp