Natural Gradient Revisited

Razvan Pascanu, Yoshua Bengio

Jan 17, 2013 · ICLR 2013 conference submission
  • Decision: conferencePoster-iclr2013-workshop
  • Abstract: The aim of this paper is twofold. First, we intend to show that Hessian-Free optimization (Martens, 2010) and Krylov Subspace Descent (Vinyals and Povey, 2012) can be described as implementations of Natural Gradient Descent due to their use of the extended Gauss-Newton approximation of the Hessian. Second, we re-derive Natural Gradient from basic principles, contrasting the two versions of the algorithm that appear in the literature.
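The natural gradient update preconditions the ordinary gradient with the inverse Fisher information matrix, θ ← θ − η F⁻¹∇L, so that steps are measured in the geometry of the model's distribution rather than of the raw parameters. As a rough illustration only (not the paper's code, and using an exact Fisher matrix rather than the Gauss-Newton approximation discussed in the abstract), here is a minimal NumPy sketch on a toy Gaussian model, N(μ, σ²) parameterized by (μ, s) with σ = exp(s), whose Fisher matrix is diag(1/σ², 2) in closed form:

```python
import numpy as np

# Toy data from a Gaussian whose mean and scale we will recover by
# minimizing the negative log-likelihood with natural gradient descent.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=1000)

def nll_grad(mu, s):
    """Gradient of the mean negative log-likelihood of N(mu, exp(s)^2)."""
    sigma = np.exp(s)
    d = data - mu
    g_mu = -np.mean(d) / sigma**2
    g_s = 1.0 - np.mean(d**2) / sigma**2
    return np.array([g_mu, g_s])

mu, s = 0.0, 0.0   # start far from the data's mean and scale
lr = 0.5
for _ in range(100):
    g = nll_grad(mu, s)
    sigma = np.exp(s)
    # Exact Fisher information of N(mu, sigma^2) w.r.t. (mu, log sigma).
    fisher = np.diag([1.0 / sigma**2, 2.0])
    # Natural gradient direction: solve F * step = g instead of using g.
    step = np.linalg.solve(fisher, g)
    mu, s = mu - lr * step[0], s - lr * step[1]

print(f"mu={mu:.3f} sigma={np.exp(s):.3f}")
```

Because the Fisher matrix rescales each coordinate by its local curvature, the iteration converges to the maximum-likelihood estimates at a rate independent of how the parameters are scaled; in large models the Fisher matrix cannot be inverted explicitly, which is where the Krylov-subspace and Hessian-free machinery cited in the abstract comes in.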