CurLL: Curriculum Learning of Language Models

Published: 23 Sept 2025 · Last Modified: 11 Nov 2025 · CCFM Oral · CC BY 4.0
Keywords: Continual Learning, Curriculum, benchmark, synthetic data, language modelling
TL;DR: A dataset grounded in a human education curriculum for training and evaluating continual learning in language models
Abstract: We introduce CurLL, a comprehensive continual learning dataset and benchmark grounded in human developmental trajectories from ages 5–10, enabling systematic and fine-grained assessment of models’ ability to progressively acquire new skills. CurLL spans five developmental stages (0–4) covering ages 5–10, with a skill graph of 32 high-level skills, 128 sub-skills, 350+ goals, and 1,300+ indicators explicitly modeling prerequisite relationships. We generate a 23.4B-token synthetic dataset with controlled skill progression, vocabulary complexity, and format diversity, comprising paragraphs, comprehension-based QA (CQA), skill-testing QA (CSQA), and instruction–response (IR) pairs. Stage-wise token counts range from 2.12B to 6.78B tokens, supporting precise analysis of forgetting, forward transfer, and backward transfer. Using a 135M-parameter transformer trained under independent, joint, and sequential (continual) setups, we show trade-offs in skill retention and transfer efficiency. By mirroring human learning patterns and providing fine-grained control over skill dependencies, this work advances continual learning evaluation for language models.
Serve As Reviewer: ~Shubhra_Mishra1
Submission Number: 8
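The abstract mentions stage-wise analysis of forgetting, forward transfer, and backward transfer under the sequential setup. As an illustration only, the sketch below computes the standard continual-learning versions of these metrics from a stage-by-stage evaluation matrix; the matrix values, the baseline `b`, and the exact metric definitions are assumptions for demonstration, not results or formulas taken from the paper.

```python
import numpy as np

# Hypothetical evaluation matrix for a sequential (continual) run over 5 stages:
# R[i, j] = accuracy on stage-j skills after finishing training on stage i.
# All numbers are illustrative placeholders, not reported results.
R = np.array([
    [0.72, 0.31, 0.20, 0.15, 0.12],
    [0.68, 0.70, 0.33, 0.22, 0.18],
    [0.61, 0.66, 0.71, 0.35, 0.24],
    [0.57, 0.62, 0.67, 0.73, 0.38],
    [0.54, 0.59, 0.64, 0.69, 0.74],
])
num_stages = R.shape[0]

# Forgetting: average drop on earlier stages from their best score to the final score.
forgetting = np.mean([R[:, j].max() - R[-1, j] for j in range(num_stages - 1)])

# Backward transfer (BWT): change on earlier stages after all later training
# (negative values indicate forgetting).
bwt = np.mean([R[-1, j] - R[j, j] for j in range(num_stages - 1)])

# Forward transfer (FWT): zero-shot score on not-yet-trained stages relative to
# a hypothetical untrained baseline b[j] (placeholder values).
b = np.full(num_stages, 0.10)
fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, num_stages)])

print(f"forgetting={forgetting:.3f}  BWT={bwt:.3f}  FWT={fwt:.3f}")
```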