How do language models learn facts? Dynamics, curricula and hallucinations

Nicolas Zucchet; Jorg Bornschein; Stephanie C.Y. Chan; Andrew Kyle Lampinen; Razvan Pascanu; Soham De

Published: 08 Jul 2025, Last Modified: 26 Aug 2025, COLM 2025, CC BY 4.0

Keywords: learning dynamics, factual recall, curricula, data distribution, hallucinations

TL;DR: We analyze the learning dynamics of language models on a synthetic memory task and show that they learn sequentially, that some data distribution properties lead to faster learning, and that hallucinations appear simultaneously with knowledge acquisition.

Abstract: Large language models accumulate vast amounts of knowledge during their pre-training, yet the dynamics governing this acquisition remain poorly understood. This work investigates the learning dynamics of language models on a synthetic factual recall task, uncovering three key findings. First, language models learn in three phases, with performance plateauing before they acquire precise factual knowledge. Mechanistically, this plateau coincides with the formation of attention-based circuits that support recall. Second, the training data distribution significantly impacts learning dynamics, with imbalanced distributions shortening the plateau. Finally, hallucinations emerge simultaneously with knowledge, and integrating new knowledge into the model through fine-tuning is challenging, as fine-tuning quickly corrupts the model's existing parametric associative memories. Our results emphasize the importance of data distribution in knowledge acquisition and suggest novel data scheduling strategies to accelerate neural network training.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 294
