Keywords: recurrent neural networks, non-convex optimization, parametrization, convergence analysis, loss landscape analysis
TL;DR: We investigate the properties of loss landscapes under canonical and modal parametrizations of recurrent neural networks
Abstract: The characteristics of the loss landscape are vital for ensuring efficient gradient-based optimization of recurrent neural networks (RNNs).
Learning dynamics in continuous-time RNNs are prone to plateauing, and recent studies have addressed this issue by analyzing loss landscapes, particularly in the setting of linear time-invariant (LTI) systems. Building on this work, we study the loss landscape in a simplified setting under modal and canonical parametrizations, derived from their respective state-space realizations. We find that the canonical parametrization offers improved quasi-convexity properties and faster learning compared to the modal form.
Theoretical results are corroborated by numerical experiments. We also show that autonomous ReLU-based RNNs with a modal structure generate trajectories that can be produced by an LTI system, while those with a canonical structure produce more complex trajectories beyond the scope of LTI systems.
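For reference (a standard control-theoretic sketch, not notation taken from the paper), the two state-space realizations underlying these parametrizations are typically the diagonal (modal) and companion (controllable canonical) forms of an LTI system, which realize the same input-output map but expose different trainable parameters:

Modal form: $A_{\mathrm{modal}} = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, with trainable parameters $(\lambda_i, b_i)$ in $B = (b_1, \dots, b_n)^\top$.

Canonical (companion) form:
$$A_{\mathrm{canon}} = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{pmatrix}, \qquad B = (0, \dots, 0, 1)^\top,$$
with trainable coefficients $(a_0, \dots, a_{n-1})$ of the characteristic polynomial. Although both realizations describe the same LTI dynamics, gradient-based training over $(\lambda_i, b_i)$ versus $(a_i)$ induces different loss landscapes, which is the comparison the abstract refers to.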
Submission Number: 65