Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points

TMLR Paper3314 Authors

09 Sept 2024 (modified: 09 May 2025) · Rejected by TMLR · CC BY 4.0

Abstract: Applying Gradient Descent with fixed Momentum (GDM) and a fixed step size to minimize a (possibly nonconvex) quadratic function is equivalent to running the Power Method with fixed Momentum (PMM) on the gradients. Thus, valuable eigen-information is available via GDM. A new algorithm called Gradient Descent with a Kick (GD-Kick) is presented, which exploits the 'free' eigen-information available from the GDM-PMM connection, and occasionally takes a locally adaptive, long step. Numerical experiments show the advantages of GD-Kick compared with vanilla GD, particularly near saddle points.
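
The GDM-PMM equivalence stated in the abstract can be checked numerically with a minimal sketch. It assumes the heavy-ball form of momentum, x_{k+1} = x_k - alpha*g_k + beta*(x_k - x_{k-1}), applied to f(x) = 0.5 x^T A x - b^T x. Under that assumption the gradients satisfy g_{k+1} = ((1+beta) I - alpha A) g_k - beta g_{k-1}, which is a power iteration with momentum. The function names, the small test matrix, and the values of alpha and beta are illustrative choices, not taken from the paper, and the sketch does not implement GD-Kick itself.

```python
import numpy as np

def gdm_gradients(A, b, x0, alpha, beta, iters):
    """Heavy-ball GD on f(x) = 0.5 x^T A x - b^T x; returns the gradients g_0..g_{iters-1}."""
    x_prev = x0.copy()
    x = x0.copy()
    grads = []
    for _ in range(iters):
        g = A @ x - b
        grads.append(g)
        x_next = x - alpha * g + beta * (x - x_prev)
        x_prev, x = x, x_next
    return grads

def pmm_iterates(A, g0, alpha, beta, iters):
    """Power method with momentum on M = (1 + beta) I - alpha A, started from g0."""
    n = A.shape[0]
    M = (1.0 + beta) * np.eye(n) - alpha * A
    # First step has no momentum term because x_{-1} = x_0 in gdm_gradients.
    ys = [g0, g0 - alpha * (A @ g0)]
    for _ in range(iters - 2):
        ys.append(M @ ys[-1] - beta * ys[-2])
    return ys

# Hypothetical nonconvex quadratic: A is indefinite, leftmost eigenvalue is -1.
A = np.diag([-1.0, 0.5, 2.0])
b = np.array([0.3, -0.2, 0.1])
x0 = np.ones(3)
alpha, beta, iters = 0.1, 0.2, 60

grads = gdm_gradients(A, b, x0, alpha, beta, iters)
ys = pmm_iterates(A, grads[0], alpha, beta, iters)

# The two sequences agree up to rounding error.
same_sequence = np.allclose(np.array(grads), np.array(ys), rtol=1e-8)

# The normalized last gradient approximates the leftmost eigenvector of A;
# its Rayleigh quotient estimates the leftmost eigenvalue.
v = grads[-1] / np.linalg.norm(grads[-1])
lam_est = v @ A @ v
print(same_sequence, lam_est, np.linalg.eigvalsh(A)[0])
```

In this toy setting the gradient norm grows because the quadratic is unbounded below, yet the gradient direction aligns with the leftmost eigenvector, so the eigen-information comes at no extra matrix-vector cost beyond what GDM already computes.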

Submission Type: Long submission (more than 12 pages of main content)

Assigned Action Editor: ~Ahmet_Alacaoglu2

Submission Number: 3314
