Adaptive Meta-Curriculum for Test-Time Self-Improvement

Published: 05 Mar 2026, Last Modified: 13 Mar 2026, Venue: ICLR 2026 Workshop RSI Spotlight, Readers: Everyone, License: CC BY 4.0
Keywords: Test-Time Compute, Meta-Curriculum, Recursive Self-Improvement
TL;DR: AMC-TSI meta-learns a dynamic curriculum that allocates test-time compute per problem and selects an improvement strategy for each, improving compute efficiency over uniform allocation.
Abstract: Recursive self-improvement (RSI) in large language models has shown promise through test-time compute scaling, but current approaches allocate computational resources uniformly across problems, ignoring heterogeneous difficulty and the varying effectiveness of different improvement strategies. We introduce Adaptive Meta-Curriculum for Test-Time Self-Improvement (AMC-TSI), a framework that meta-learns both a curriculum scheduler to predict compute-benefit trade-offs and an adaptive improvement operator that selects among revision, search, and reflection strategies based on problem characteristics. Our key contributions are: (1) a theoretical framework proving that adaptive compute allocation under learned curricula achieves superior sample complexity bounds compared to uniform allocation, (2) a meta-learning algorithm that jointly optimizes curriculum difficulty estimation and improvement operator selection, and (3) empirical validation showing 2.3$\times$ improved compute efficiency and 18.7% higher accuracy on mathematical reasoning benchmarks. AMC-TSI addresses a critical gap in RSI systems by enabling efficient, problem-aware self-improvement without manual tuning.
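The contrast between uniform and adaptive allocation described in the abstract can be illustrated with a toy sketch. This is not the AMC-TSI algorithm: the `Problem` class, the hand-written `predicted_gain` function (standing in for a learned curriculum scheduler), the per-strategy "preferred difficulty" values, and the greedy `allocate` loop are all hypothetical, chosen only to show how a compute-benefit predictor could steer both the budget and the choice among revision, search, and reflection.

```python
from dataclasses import dataclass

STRATEGIES = ("revision", "search", "reflection")

# Hypothetical: the difficulty at which each strategy is assumed to work best.
PREFERRED_DIFFICULTY = {"revision": 0.2, "search": 0.5, "reflection": 0.8}


@dataclass
class Problem:
    name: str
    difficulty: float  # hypothetical difficulty estimate in [0, 1]


def predicted_gain(problem: Problem, strategy: str, steps_used: int) -> float:
    """Toy stand-in for a learned compute-benefit predictor.

    Gain grows with difficulty, with how well the strategy fits the
    difficulty, and halves with every step already spent on the problem.
    """
    fit = 1.0 - abs(problem.difficulty - PREFERRED_DIFFICULTY[strategy])
    return problem.difficulty * fit * (0.5 ** steps_used)


def allocate(problems: list[Problem], budget: int):
    """Greedily spend one compute step at a time on the best (problem, strategy)."""
    steps = {p.name: 0 for p in problems}
    plan = []
    for _ in range(budget):
        # Pick the pair with the highest predicted marginal gain.
        _, name, strategy = max(
            (predicted_gain(p, s, steps[p.name]), p.name, s)
            for p in problems
            for s in STRATEGIES
        )
        plan.append((name, strategy))
        steps[name] += 1
    return steps, plan
```

Under these toy assumptions, calling `allocate([Problem("easy", 0.1), Problem("hard", 0.9)], budget=6)` spends most steps on the hard problem, whereas a uniform scheme would give each problem three steps regardless of predicted benefit.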
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 123