## Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks

Sep 06, 2019, NeurIPS 2019
• Abstract: We propose a novel memory unit for recurrent neural networks. Both the architecture and the initial weights of our Legendre Memory Unit (LMU) are mathematically derived to orthogonally decompose its continuous-time history. Each unit does so by solving $d$ coupled ordinary differential equations (ODEs), whose phase space linearly maps onto sliding windows of time via the Legendre polynomials up to degree $d - 1$. Backpropagation across stacked layers of LMU cells outperforms equivalently-sized LSTM networks by more than two orders of magnitude on a memory capacity benchmark, while significantly reducing training and inference times. LMUs can handle temporal dependencies spanning $T = 100{,}000$ time-steps, converge rapidly, and use relatively few internal state-variables to learn complex functions across long windows of time, exceeding state-of-the-art performance among RNNs on permuted sequential MNIST. These results are due to the network's disposition to learn scale-invariant features independently of step size. Backpropagation through the ODE solver allows each unit to adapt its internal time-step, enabling the network to learn task-relevant time-scales. We demonstrate that LMU memory cells can be implemented using $m$ recurrently-connected Poisson spiking neurons, in $\mathcal{O}( m )$ time and memory, with error scaling as $\mathcal{O}( d / \sqrt{m} )$.
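
The abstract does not spell out the $d$ coupled ODEs, so the following is only a minimal sketch of how such a memory could be simulated. It assumes the linear system $\theta\,\dot{m}(t) = A\,m(t) + B\,u(t)$ with the coefficient matrices given in the body of the LMU paper, and a readout of the window through shifted Legendre polynomials. The function names (`lmu_matrices`, `shifted_legendre`, `simulate`, `decode`), the sine-wave input, and the forward-Euler integrator are illustrative choices and not the authors' implementation.

```python
import numpy as np
from math import comb


def lmu_matrices(d):
    """Continuous-time (A, B) for a d-dimensional memory (assumed from the paper body)."""
    i = np.arange(d)[:, None]
    j = np.arange(d)[None, :]
    # a_ij = (2i + 1) * (-1 if i < j else (-1)^(i - j + 1))
    A = (2 * i + 1) * np.where(i < j, -1.0, (-1.0) ** (i - j + 1))
    # b_i = (2i + 1) * (-1)^i
    B = (2 * np.arange(d) + 1) * (-1.0) ** np.arange(d)
    return A, B


def shifted_legendre(i, r):
    """Shifted Legendre polynomial of degree i, evaluated at r in [0, 1]."""
    return (-1) ** i * sum(
        comb(i, j) * comb(i + j, j) * (-r) ** j for j in range(i + 1)
    )


def simulate(u, dt, theta, d):
    """Integrate theta * dm/dt = A m + B u with forward Euler (a simplification)."""
    A, B = lmu_matrices(d)
    m = np.zeros(d)
    states = np.empty((len(u), d))
    for k, u_k in enumerate(u):
        m = m + (dt / theta) * (A @ m + B * u_k)
        states[k] = m
    return states


def decode(states, r):
    """Approximate u(t - r * theta) as a fixed linear readout of the state."""
    weights = np.array([shifted_legendre(i, r) for i in range(states.shape[1])])
    return states @ weights


# Illustrative run: remember a slow sine wave over a window of theta = 1 second.
dt, theta, d = 1e-3, 1.0, 6
t = np.arange(0.0, 5.0, dt)
u = np.sin(np.pi * t)
states = simulate(u, dt, theta, d)
delayed = decode(states, 1.0)  # estimate of u(t - theta)
```

Under these assumptions, `decode(states, 0.5)` reads out the midpoint of the window from the same state, which illustrates the sense in which the phase space maps linearly onto a sliding window of time. Forward Euler is used only to keep the sketch short; a different discretization may be preferable for larger $d$ or larger step sizes.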