Gradient-based learning drives robust representations in recurrent neural networks by balancing compression and expansion

Matthew Farrell, Stefano Recanatesi, Timothy Moore, Guillaume Lajoie, Eric Shea-Brown

Published: 22 Jun 2022, Last Modified: 07 May 2026, Nature Machine Intelligence, CC BY-SA 4.0

Abstract: Neural networks need the right representations of input data to learn. Here we ask how gradient-based learning shapes a fundamental property of representations in recurrent neural networks (RNNs): their dimensionality. Through simulations and mathematical analysis, we show how gradient descent can lead RNNs to compress the dimensionality of their representations in a way that matches task demands during training while supporting generalization to unseen examples. This can require an expansion of dimensionality in early timesteps and compression in later ones, and strongly chaotic RNNs appear particularly adept at learning this balance. Beyond helping to elucidate the power of appropriately initialized artificial RNNs, this fact has implications for neurobiology as well. Neural circuits in the brain reveal both high variability associated with chaos and low-dimensional dynamical structures. Taken together, our findings show how simple gradient-based learning rules lead neural networks to solve tasks with robust representations that generalize to new cases.

Neural networks in the brain often exhibit chaotic dynamics that can be captured by a small number of dimensions. Farrell et al. find that recurrent neural networks trained with gradient-based learning rules exhibit similar features. This helps form robust but generalizable input representations.

External IDs: doi:10.1038/s42256-022-00498-0