Mechanisms of skill transfer from pretraining to target task in recurrent neural networks

ICLR 2026 Conference Submission22036 Authors

19 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0

Keywords: RNN, dynamical systems, learning theory, curriculum learning

TL;DR: Pretraining on sub-tasks of the target yields faster training and better representations by compositionally adapting RNN dynamical systems features.

Abstract: Pretraining on simpler tasks can often improve learning outcomes on a more difficult target task. Nonetheless, what makes for a good pretraining curriculum, and the mechanisms of positive transfer across tasks, remain poorly understood. Here we use RNNs trained on fixed-length temporal integration to compare curricula with varying degrees of effectiveness. We show that pretraining on simpler versions of the target task is less effective than curricula that take advantage of the target task's compositional structure and train the sub-skills needed for solving it. By exploiting the highly structured solution of our target task, we can mechanistically explain improvements in the speed and quality of learning in terms of the slow features of the RNN dynamics that the curriculum helps build, and the reuse and adaptation of those slow features during target training. Our results argue that pretraining on tasks that individually hone sub-skills required for the target is particularly beneficial, as it builds a scaffolding on which additional dynamical systems structures can be compositionally expanded to achieve the final function. Thus, our results document a novel mechanism for repurposing dynamical systems features in support of cognitive flexibility.

Supplementary Material: zip

Primary Area: applications to neuroscience & cognitive science

Submission Number: 22036
