Efficiently Representing Finite-state Automata With Recurrent Neural Networks

Abstract: Understanding neural network architectures with formal models of computation promises to spark a better understanding of the network's capabilities and limitations. A long line of work has described recurrent neural networks (RNNs) in terms of their connection to the well-understood finite-state automata (FSAs), whose sequential nature provides a useful analogy to how RNNs function. Minsky's [1954] construction first showed how RNNs can simulate FSAs and thereby provided a way of understanding RNNs through the lens of FSAs. This paper presents a comprehensive review of this construction along with two additional classical results showcasing the relationship between RNNs and FSAs: the constructions due to Dewdney [1977] and Indyk [1995]. We are not only interested in \emph{whether} an RNN can simulate an FSA, but also in the space requirements for doing so: Whereas Minsky [1954] shows that an RNN can simulate an FSA with $N$ states using $\mathcal{O}\left(N\right)$ neurons, Dewdney [1977] improves this to $\mathcal{O}\left(N^{\frac{3}{4}}\right)$ and Indyk [1995] further to $\mathcal{O}\left(\sqrt{N}\right)$, which he also shows to be optimal. We discuss the constructions, emphasizing their commonalities, and put them into the context of more modern research, focusing on the representational capacity of neural language models.
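To make the flavor of Minsky's construction concrete, the following is a minimal sketch (not taken from the paper; all names such as `minsky_rnn`, `delta`, and `run` are illustrative). It builds a Heaviside-activation RNN whose hidden units are indexed by (state, symbol) pairs, so a deterministic FSA with $N$ states over a fixed alphabet is simulated with $\mathcal{O}(N)$ neurons: a unit fires exactly when it receives one "vote" from the input symbol and one from a currently active unit whose state transitions into it.

```python
import numpy as np

def minsky_rnn(n_states, n_symbols, delta):
    """Construct Heaviside-RNN weights simulating a DFA (Minsky-style sketch).

    Hidden units are indexed by (state, symbol) pairs, giving
    n_states * n_symbols = O(N) neurons for a fixed alphabet.
    delta[q][a] is the DFA transition function.
    """
    hidden = n_states * n_symbols
    idx = lambda q, a: q * n_symbols + a          # flatten (state, symbol)

    W = np.zeros((hidden, hidden))                # recurrent weights
    U = np.zeros((hidden, n_symbols))             # input weights
    b = np.full(hidden, -1.5)                     # fire only if both "votes" arrive

    for q_next in range(n_states):
        for a_next in range(n_symbols):
            U[idx(q_next, a_next), a_next] = 1.0  # vote: "current input is a_next"
    for q in range(n_states):
        for a in range(n_symbols):                # symbol stored in the active unit (unused)
            for a_next in range(n_symbols):
                # vote: unit (delta(q, a_next), a_next) is supported by any active unit (q, *)
                W[idx(delta[q][a_next], a_next), idx(q, a)] = 1.0
    return W, U, b, idx

def run(W, U, b, idx, n_symbols, q0, string):
    """Step the RNN over `string`; the single active unit encodes the DFA state."""
    h = np.zeros(W.shape[0])
    h[idx(q0, 0)] = 1.0                           # start in q0 (the stored symbol is arbitrary)
    for a in string:
        x = np.eye(n_symbols)[a]
        h = (W @ h + U @ x + b > 0).astype(float) # Heaviside activation
    return h

# Example: DFA over {0, 1} accepting strings with an even number of 1s.
delta = [[0, 1], [1, 0]]                          # delta[state][symbol]
W, U, b, idx = minsky_rnn(n_states=2, n_symbols=2, delta=delta)
h = run(W, U, b, idx, n_symbols=2, q0=0, string=[1, 0, 1, 1])
state = int(np.argmax(h)) // 2                    # recover the DFA state from the active unit
print(state, "-> accept" if state == 0 else "-> reject")  # three 1s read: reject
```

The sketch uses $|Q| \cdot |\Sigma|$ neurons, i.e., $\mathcal{O}(N)$ for a fixed alphabet; the Dewdney [1977] and Indyk [1995] constructions discussed in the paper achieve their sublinear bounds by compressing this one-hot state representation.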
