A Linear Network Theory of Iterated Learning

Published: 10 Oct 2024, Last Modified: 25 Dec 2024, NeurIPS'24 Compositional Learning Workshop Poster, CC BY 4.0
Keywords: Iterated Learning, Linear Neural Networks, Systematic Generalization
TL;DR: We derive the learning dynamics of linear neural networks under the iterated learning algorithm and use them to study the algorithm's potential benefits.
Abstract: Language provides one of the primary examples of humans' ability to systematically generalize --- reasoning about new situations by combining aspects of previous experiences. Consequently, modern machine learning has drawn much inspiration from linguistics. A recent example is iterated learning (IL), a procedure in which generations of networks learn from the output of earlier learners. The result is a refinement of the network's ``language'', i.e. its output labels for given inputs, towards compositional structure. Yet studies of iterated learning and its application to machine learning have remained empirical. Here we theoretically study the emergence of compositional language and the ability of simple neural networks to leverage this compositionality to systematically generalize. We build on prior theoretical work on linear networks, which mathematically defines systematic generalization, extending the analysis of shallow and deep linear network learning dynamics to the iterated learning procedure by deriving exact dynamics for learning over generations. Our results confirm a long-standing conjecture: multiple generations of iterated learning are required for compositional structure to emerge, and the resulting networks can outperform a single-generation network trained with optimal early stopping. Finally, we show that IL requires depth in the network architecture to be effective and that it is able to extract modules which systematically generalize.
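To make the iterated learning procedure concrete, below is a minimal sketch in Python/NumPy of generations of two-layer (deep) linear networks, each trained by full-batch gradient descent on the outputs of the previous generation. All names, dimensions, and hyperparameters are illustrative assumptions for exposition, not the paper's exact setup or derived dynamics.

```python
# Minimal sketch of iterated learning with a deep (two-layer) linear network.
# Hyperparameters, dimensions, and the data-generating process are assumptions
# for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def train_linear_net(X, Y, hidden=32, lr=0.05, steps=2000):
    """Train y = W2 @ W1 @ x by full-batch gradient descent on squared error."""
    d_in, d_out = X.shape[1], Y.shape[1]
    W1 = 0.01 * rng.standard_normal((hidden, d_in))
    W2 = 0.01 * rng.standard_normal((d_out, hidden))
    for _ in range(steps):
        pred = X @ W1.T @ W2.T              # network outputs, shape (n, d_out)
        grad = (pred - Y) / len(X)          # d(loss)/d(pred)
        W2 -= lr * grad.T @ (X @ W1.T)      # gradient w.r.t. second layer
        W1 -= lr * (W2.T @ grad.T) @ X      # gradient w.r.t. first layer
    return W1, W2

# A noisy linear map plays the role of the initial "teacher" labels.
d_in, d_out, n = 8, 8, 64
X = rng.standard_normal((n, d_in))
labels = X @ rng.standard_normal((d_in, d_out)) + 0.1 * rng.standard_normal((n, d_out))

# Each generation learns from the previous generation's outputs ("language").
for generation in range(5):
    W1, W2 = train_linear_net(X, labels)
    labels = X @ W1.T @ W2.T
```

In a fuller iterated-learning setup, each new generation would typically see only a subset of the previous generation's labeled inputs; the loop above omits that sampling step for brevity.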
Submission Number: 18