Keywords: recurrent neural network, RNN, long short-term memory, LSTM, gated recurrent unit, GRU, dynamical mathematics, interpretability
TL;DR: This paper models the cell states of LSTMs and GRUs as dynamic Hopfield networks to derive EINS, a novel lightweight RNN that performs comparably to, or better than, the LSTM across a wide range of tasks.
Abstract: This paper contrasts the two canonical recurrent neural networks (RNNs), the long short-term memory (LSTM) and the gated recurrent unit (GRU), to propose our novel lightweight RNN, Extrapolated Input for Network Simplification (EINS). We treat LSTMs and GRUs as differential equations, and our analysis highlights several auxiliary components of the standard LSTM design that are secondary in importance. Guided by these insights, we discard these redundancies to arrive at the EINS design. We test EINS against the LSTM across a carefully chosen range of tasks, from language modelling and medical data imputation-prediction, through a sentence-level variational autoencoder and image generation, to learning to learn to optimise another neural network. Despite its simpler design and fewer parameters, EINS performs comparably to, or better than, the LSTM on each task.
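For reference, the standard LSTM cell that the paper analyses computes four gated quantities per step. The NumPy sketch below is illustrative only: the abstract does not specify which components EINS removes, and the names here (lstm_cell, W, U, b) are our own. It shows the forget, input, and output gates and the separate cell and hidden states whose relative importance the paper's differential-equation view is used to assess.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell (not the EINS cell, which
    the abstract does not specify).

    W, U, b hold the stacked parameters for the four gates:
    input i, forget f, cell candidate g, output o.
    """
    z = W @ x + U @ h_prev + b            # (4*hidden,) pre-activations
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c_prev + i * g                # cell-state update
    h = o * np.tanh(c)                    # hidden-state output
    return h, c

# Tiny usage example with random parameters.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
h, c = lstm_cell(rng.normal(size=n_in), h, c, W, U, b)
```

Note that the cell-state update can be read as a discretised differential equation in c, which is the kind of reading the paper uses to argue that some of these gated terms are secondary.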