Keywords: recurrent neural networks, non-convex optimization, parametrization, convergence analysis, loss landscape analysis
TL;DR: We investigate the properties of loss landscapes under canonical and modal parametrizations of recurrent neural networks
Abstract: The characteristics of the loss landscape are vital for ensuring efficient gradient-based optimization of recurrent neural networks (RNNs).
Learning dynamics in continuous-time RNNs are prone to plateauing, and recent studies have addressed this issue by analyzing loss landscapes, particularly in the setting of linear time-invariant (LTI) systems. Building on this work, we study the loss landscape in a simplified setting under modal and canonical parametrizations, derived from their respective state-space realizations. We find that the canonical parametrization offers improved quasi-convexity properties and faster learning compared to the modal form.
Theoretical results are corroborated by numerical experiments. We also show that autonomous ReLU-based RNNs with a modal structure generate trajectories that can be produced by an LTI system, while those with a canonical structure produce more complex trajectories beyond the scope of LTI systems.
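For reference (a standard control-theoretic sketch, not notation taken from the paper), the two state-space realizations underlying these parametrizations are typically the diagonal (modal) and companion (controllable canonical) forms of an LTI system, which realize the same input-output map but expose different trainable parameters:

Modal form: $A_{\mathrm{modal}} = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, with trainable parameters $(\lambda_i, b_i)$ in $B = (b_1, \dots, b_n)^\top$.

Canonical (companion) form:
$$A_{\mathrm{canon}} = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{pmatrix}, \qquad B = (0, \dots, 0, 1)^\top,$$
with trainable coefficients $(a_0, \dots, a_{n-1})$ of the characteristic polynomial. Although both realizations describe the same LTI dynamics, gradient-based training over $(\lambda_i, b_i)$ versus $(a_i)$ induces different loss landscapes, which is the comparison the abstract refers to.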
Submission Number: 65