On the theoretical limit of gradient descent for Simple Recurrent Neural Networks with finite precision

Published: 19 Dec 2024, Last Modified: 19 Dec 2024
Accepted by TMLR
License: CC BY 4.0
Abstract: Despite their great practical successes, the behavior of neural networks remains an active research topic. In particular, the class of functions learnable under a finite-precision configuration is an open question. In this paper, we study the limits of gradient descent when such a configuration is imposed on the class of Simple Recurrent Networks (SRN). We exhibit conditions under which gradient descent will provably fail. We also design a class of SRNs based on Deterministic Finite State Automata (DFA) that fulfills these failure conditions. The definition of this class is constructive: we propose an algorithm that, from any DFA, constructs an SRN computing exactly the same function, a result of interest in its own right.
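To give a concrete sense of the kind of DFA-to-SRN construction the abstract refers to, below is a minimal illustrative sketch. It is not the paper's algorithm (see the linked code repository for that); it uses the classic one-hidden-unit-per-(state, symbol) encoding with a hard-threshold recurrent update h_{t+1} = step(W h_t + U x_t + b). The toy DFA, the `accepts` helper, and all weight choices here are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical sketch (not the paper's exact construction): embed a DFA into a
# simple recurrent update h_{t+1} = step(W @ h_t + U @ x_t + b) with one
# hidden unit per (state, symbol) pair and a hard threshold as nonlinearity.

# Toy DFA over {0, 1} accepting strings with an even number of 1s.
n_states, n_symbols = 2, 2
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}  # (state, symbol) -> next state
start, accepting = 0, {0}

n_hidden = n_states * n_symbols
idx = lambda q, a: q * n_symbols + a  # unit meaning "now in state q, just read a"

W = np.zeros((n_hidden, n_hidden))   # recurrent weights
U = np.zeros((n_hidden, n_symbols))  # input weights
b = -1.5 * np.ones(n_hidden)         # bias acting as an AND threshold

for q in range(n_states):
    for a_prev in range(n_symbols):
        for a in range(n_symbols):
            # Unit (delta(q, a), a) should fire iff some unit (q, *) is on
            # AND the current input symbol is a.
            W[idx(delta[(q, a)], a), idx(q, a_prev)] = 1.0
for q in range(n_states):
    for a in range(n_symbols):
        U[idx(q, a), a] = 1.0

def accepts(word):
    # Initialize as if the start state had just been reached on symbol 0.
    h = np.zeros(n_hidden)
    h[idx(start, 0)] = 1.0
    for a in word:
        x = np.eye(n_symbols)[a]
        h = (W @ h + U @ x + b >= 0.0).astype(float)  # hard-threshold step
    current_state = int(np.argmax(h)) // n_symbols
    return current_state in accepting

assert accepts([1, 1]) and accepts([]) and not accepts([1, 0, 0])
```

With exact (one-hot) activations this simulation is exact; the paper's point about finite precision and gradient descent concerns what happens when such networks must be *learned* rather than constructed.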
Submission Length: Long submission (more than 12 pages of main content)
Video: https://www.youtube.com/watch?v=ap6LOok_Vtk&ab_channel=VolodimirMitarchuk
Code: https://github.com/23Vladymir57/TMLR_Code
Assigned Action Editor: ~Lechao_Xiao2
Submission Number: 3124