Keywords: continual learning, compositionality, contextual inference, low-rank RNN
TL;DR: A two-system RNN achieves continual learning by using a 'what' system to infer compositional structures in tasks and a 'how' system to compose low-rank components, demonstrating transfer and compositional generalization.
Abstract: The ability to continually learn new skills, retain them, and flexibly deploy them to accomplish goals is a key feature of intelligent and efficient behavior. However, the neural mechanisms facilitating the continual learning and flexible (re-)composition of skills remain elusive. Here, we study continual learning and the compositional reuse of learned computations in recurrent neural network (RNN) models using a novel two-system approach: one system that infers 'what' computation to perform, and one that implements 'how' to perform it. We focus on a set of compositional cognitive tasks commonly studied in neuroscience. To construct the 'what' system, we first show that a large family of tasks can be systematically described by a probabilistic generative model, where compositionality stems from a shared underlying vocabulary of discrete task-epochs. We then develop an unsupervised online learning approach that learns this model on a single-trial basis, building its vocabulary incrementally as it is exposed to new tasks and inferring the latent epoch structure as a time-varying computational context within a trial. We implement the 'how' system as an RNN whose low-rank components are composed according to the context inferred by the 'what' system.
This contextual inference facilitates the creation, learning, and reuse of the low-rank RNN components as new tasks are introduced sequentially, enabling continual learning without catastrophic forgetting. Using an example task set, we demonstrate the efficacy and competitive performance of this two-system learning framework, its potential for forward and backward transfer, and few-shot learning via re-composition.
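To make the 'what' system concrete, the following is a minimal sketch of one way such a generative model could be written, assuming a Markov chain over discrete epochs drawn from a shared vocabulary; the notation ($z_t$, $x_t$, $K$) is illustrative and not taken from the paper:

$$ p(x_{1:T}, z_{1:T}) \;=\; p(z_1)\, p(x_1 \mid z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1})\, p(x_t \mid z_t), \qquad z_t \in \{1, \dots, K\}, $$

where $z_t$ indexes the shared epoch vocabulary of size $K$ and $x_t$ denotes the observed inputs and targets at time $t$. Under this reading, contextual inference corresponds to computing the posterior $p(z_t \mid x_{1:t})$ online within a trial, and the vocabulary grows by adding a new epoch whenever no existing one explains the current trial well.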
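For the 'how' system, here is a minimal sketch of composing a low-rank RNN from rank-1 components weighted by an inferred context; the dimensions, the Euler discretization, and the uniform placeholder context are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

N, K, T = 128, 4, 200                          # neurons, epoch vocabulary size, timesteps
m = rng.standard_normal((K, N)) / np.sqrt(N)   # left factors, one rank-1 component per epoch
n = rng.standard_normal((K, N)) / np.sqrt(N)   # right factors
x = np.zeros(N)                                # RNN state

def step(x, c, u, dt=0.1):
    """One Euler step of a low-rank RNN whose recurrence is a
    context-weighted sum of rank-1 components: W(t) = sum_k c_k m_k n_k^T."""
    r = np.tanh(x)
    # Compose the recurrence without ever forming the full N x N matrix:
    rec = m.T @ (c * (n @ r))                  # sum_k c_k m_k (n_k . r)
    return x + dt * (-x + rec + u)

for t in range(T):
    c = np.ones(K) / K                         # placeholder: posterior over epochs from the 'what' system
    u = rng.standard_normal(N) * 0.01          # placeholder external input
    x = step(x, c, u)
```

Keeping the factors separate means the full $N \times N$ weight matrix is never materialized, and adding a new task-epoch amounts to appending a new $(m_k, n_k)$ pair while leaving existing components untouched, which is one way the composition could support learning new tasks without overwriting old ones.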
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 22391