Reasoning Curriculum: Bootstrapping Broad LLM Reasoning from Math

ICLR 2026 Conference Submission 19515 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: LLM, Reasoning
TL;DR: Start RL on math, then train jointly across domains to reliably boost general reasoning in LLMs.
Abstract: Reinforcement learning (RL) can elicit strong reasoning in large language models (LLMs), yet most open efforts focus on math and code. We propose $\textbf{\texttt{Reasoning Curriculum}}$, a simple two-stage curriculum that first elicits reasoning skills in pretraining-aligned domains such as math, then adapts and refines these skills across other domains via joint RL. Stage 1 performs a brief cold start and then math-only RL with verifiable rewards to develop reasoning skills. Stage 2 runs joint RL on mixed-domain data to transfer and consolidate these skills. The curriculum is minimal and backbone-agnostic, requiring no specialized reward models beyond standard verifiability checks. Evaluated on Qwen3-4B and Llama-3.1-8B over a multi-domain suite, $\texttt{Reasoning Curriculum}$ yields consistent gains. Ablations and a cognitive-skill analysis indicate that both stages are necessary and that math-first elicitation increases cognitive behaviors important for solving complex problems. $\texttt{Reasoning Curriculum}$ provides a compact, easy-to-adopt recipe for general reasoning.
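The abstract describes a two-stage recipe: a brief cold start followed by math-only RL with verifiable rewards, then joint RL on mixed-domain data. The sketch below illustrates that control flow in Python under stated assumptions; the function names, data fields, and step counts are illustrative placeholders, not the authors' implementation, and the exact-match reward stands in for whatever verifiability checks each domain uses.

```python
"""Minimal sketch of the two-stage Reasoning Curriculum described in the abstract.
All names and hyperparameters below are illustrative assumptions."""

from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Example:
    prompt: str
    reference_answer: str
    domain: str  # e.g. "math", "code", "science", "logic"


def verifiable_reward(model_answer: str, reference_answer: str) -> float:
    """Binary reward from a standard verifiability check (exact-match placeholder)."""
    return 1.0 if model_answer.strip() == reference_answer.strip() else 0.0


def cold_start_sft(policy: Callable[[str], str], demos: Iterable[Example]) -> None:
    """Brief supervised warm-up on reasoning demonstrations (stub)."""
    for _ in demos:
        pass  # a short SFT pass would go here in a real pipeline


def rl_step(policy: Callable[[str], str], batch: list[Example],
            reward_fn: Callable[[str, str], float]) -> float:
    """One policy update with verifiable rewards (stub returning mean reward)."""
    rewards = [reward_fn(policy(ex.prompt), ex.reference_answer) for ex in batch]
    return sum(rewards) / max(len(rewards), 1)


def reasoning_curriculum(policy: Callable[[str], str],
                         demos: Iterable[Example],
                         math_batches: Iterator[list[Example]],
                         mixed_domain_batches: Iterator[list[Example]],
                         stage1_steps: int = 1000,
                         stage2_steps: int = 1000) -> None:
    # Stage 1: cold start, then math-only RL to elicit reasoning skills.
    cold_start_sft(policy, demos)
    for _ in range(stage1_steps):
        rl_step(policy, next(math_batches), verifiable_reward)

    # Stage 2: joint RL on mixed-domain data to transfer and consolidate skills.
    for _ in range(stage2_steps):
        rl_step(policy, next(mixed_domain_batches), verifiable_reward)
```

The point of the sketch is the ordering: reasoning skills are first elicited in a pretraining-aligned domain (math), and only then refined jointly across domains, with no reward model beyond the verifiability check.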
Primary Area: foundation or frontier models, including LLMs
Submission Number: 19515