On the theoretical limit of gradient descent for Simple Recurrent Neural Networks with finite precision

TMLR Paper 3124 Authors

03 Aug 2024 (modified: 18 Nov 2024) · Decision pending for TMLR · CC BY 4.0
Abstract: Despite their great practical success, the behavior of neural networks is still not fully understood. In particular, the class of functions learnable under a finite-precision configuration remains an open question. In this paper, we study the limits of gradient descent in such a configuration for the class of Simple Recurrent Networks (SRN). We exhibit conditions under which gradient descent provably fails. We also design a class of SRNs, based on Deterministic Finite Automata (DFA), that fulfills these failure requirements. The definition of this class is constructive: we propose an algorithm that, given any DFA, builds an SRN computing exactly the same function, a result of interest in its own right.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Lechao_Xiao2
Submission Number: 3124