Keywords: optimization, gradient flow, port-Hamiltonian
TL;DR: We introduce an optimization framework that leverages the port-Hamiltonian dynamical system formalism.
Abstract: In this paper we present a general framework for continuous-time gradient descent, often referred to as gradient flow. We extend Hamiltonian gradient flows, which ascribe mechanical dynamics to neural network parameters and constitute a natural continuous-time alternative to discrete momentum-based gradient descent approaches. The proposed Port-Hamiltonian Gradient Flow (PHGF) casts neural network training into a system-theoretic framework: a fictitious physical system is coupled to the neural network by setting the loss function as an energy term of the system. As autonomous port-Hamiltonian systems dissipate energy by construction and settle at a minimum of the energy, solving the system simultaneously trains the neural network. We show that general PHGFs are compatible with both continuous-time data-stream optimization, where the optimizer processes a continuous stream of data, and standard fixed-step optimization. In continuous time, PHGFs allow for the embedding of black-box adaptive-step ODE solvers and are able to stay on the energy manifold, thus avoiding divergence due to large learning rates. In fixed-step optimization, on the other hand, PHGFs open the door to novel fixed-step approaches based on symplectic discretizations of the port-Hamiltonian system, with a memory footprint and computational complexity similar to those of momentum optimizers.
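
The following is not part of the submission, but a minimal sketch of the kind of dissipative Hamiltonian flow the abstract describes, under illustrative assumptions: a separable energy H(q, p) = f(q) + 0.5·||p||², a fixed damping coefficient `gamma`, and a semi-implicit (symplectic) Euler discretization. The function name `phgf_symplectic_euler` and all defaults are hypothetical, not taken from the paper.

```python
import numpy as np

def phgf_symplectic_euler(grad_f, q0, steps=1000, h=0.01, gamma=1.0):
    """Illustrative dissipative Hamiltonian gradient flow (not the paper's code).

    State (q, p) evolves under H(q, p) = f(q) + 0.5 * ||p||^2 with
        dq/dt = p,   dp/dt = -grad f(q) - gamma * p,
    discretized with a semi-implicit (symplectic) Euler step, so the
    energy decreases due to the damping term and q drifts to a minimizer.
    """
    q = np.asarray(q0, dtype=float)
    p = np.zeros_like(q)
    for _ in range(steps):
        p = p - h * (grad_f(q) + gamma * p)  # dissipative momentum update
        q = q + h * p                        # position update uses new momentum
    return q

# Usage: minimize a simple quadratic f(q) = 0.5 * q^T A q
if __name__ == "__main__":
    A = np.diag([1.0, 10.0])
    grad_f = lambda q: A @ q
    q_star = phgf_symplectic_euler(grad_f, q0=[3.0, -2.0])
    print(q_star)  # should approach the minimizer at the origin
```

Per step, this scheme stores only the parameters and one momentum buffer, which is the sense in which the abstract compares its cost to that of momentum optimizers.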