On the theoretical limit of gradient descent for Simple Recurrent Neural Networks with finite precision
Abstract: Despite their great practical successes, the behavior of neural networks is still not fully understood. In particular, the class of functions learnable under a finite-precision configuration remains an open question. In this paper, we study the limits of gradient descent in such a configuration for the class of Simple Recurrent Networks (SRN). We exhibit conditions under which gradient descent provably fails. We also design a class of SRNs based on Deterministic Finite State Automata (DFA) that satisfies these failure conditions. The definition of this class is constructive: we propose an algorithm that, from any DFA, builds an SRN computing exactly the same function, a result of interest in its own right.
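To illustrate the kind of DFA-to-SRN construction the abstract refers to, the following is a minimal sketch assuming a one-hot encoding of DFA states and a hard-threshold activation; the names `dfa_to_srn_weights` and `run_srn` are hypothetical and this is not necessarily the paper's actual algorithm.

```python
import numpy as np

# Hypothetical sketch: encode a DFA's transition function delta in per-symbol
# recurrent weight matrices so that a one-hot hidden state tracks the DFA
# state exactly. (Assumption: not taken from the paper.)

def dfa_to_srn_weights(n_states, n_symbols, delta):
    """delta[q][a] -> next state. Returns W[a], a 0/1 matrix mapping a
    one-hot state vector to the one-hot vector of the next state."""
    W = np.zeros((n_symbols, n_states, n_states))
    for q in range(n_states):
        for a in range(n_symbols):
            W[a, delta[q][a], q] = 1.0  # column q routes to the next state
    return W

def run_srn(W, start_state, word):
    h = np.zeros(W.shape[1])
    h[start_state] = 1.0  # one-hot initial state
    for a in word:
        # A hard step keeps the state vector exactly one-hot; with finite
        # precision a sufficiently steep sigmoid would approximate this.
        h = (W[a] @ h > 0.5).astype(float)
    return int(np.argmax(h))

# Example: a 2-state DFA over {0, 1} tracking the parity of 1s in the input.
delta = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
W = dfa_to_srn_weights(n_states=2, n_symbols=2, delta=delta)
print(run_srn(W, start_state=0, word=[1, 0, 1, 1]))  # -> 1 (odd number of 1s)
```

The one-hot/threshold encoding is one standard way to simulate a DFA with a recurrent network; whether the paper uses this scheme or a different one is not stated in the abstract.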
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Lechao_Xiao2
Submission Number: 3124