Gating Mechanisms Underlying Sequence-to-Sequence Working Memory

Published: 28 Jan 2022, Last Modified: 13 Feb 2023
ICLR 2022 Submitted
Readers: Everyone
Keywords: Working Memory, RNN, Dynamical Systems, Slow Manifold, Gating
Abstract: Working memory is the process by which a system temporarily stores information for as long as it is needed. Retention and manipulation of discrete sequences are fundamental building blocks of the computation required to perform working memory tasks. Recurrent neural networks (RNNs) have proven to be powerful tools for such problems because, through training, they give rise to the dynamical behavior needed to carry out these computations over many time steps. So far, the means by which the learned internal structures of the network produce a desired set of outputs remain largely elusive. Furthermore, what is known is often difficult to extrapolate from because it is tied to a task-specific formalism. In this work, we analyze in fine detail an RNN trained perfectly on a discrete sequence working memory task. We explain the learned mechanisms by which this network holds information in memory and extracts it from memory, and why gating is a natural architectural component for achieving these structures. We construct a synthetic solution to a simplified variant of the working memory task. We then explore how these results can be extrapolated to alternative tasks.
One-sentence Summary: A synthetic RNN solution to a sequence-to-sequence working memory task, inspired by analysis of a trained network, is constructed and extrapolated to alternative tasks.
Supplementary Material: zip
