Keywords: Test-Time Learning, Self-Evolving Agents, Curriculum Learning
Abstract: Test-time learning enables large language model (LLM) agents to adapt during inference without costly retraining, yet prior work largely treats all test-time experience as equally useful. We ask a simple question: *what data should agents learn from at test time?* Focusing on task selection and ordering for context-based adaptation, we hypothesize that redundant or overly simple examples offer diminishing returns, while curated curricula improve sample efficiency. Using the Agentic Context Engineering (ACE) framework, we evaluate on the AppWorld benchmark of tool-use and coding agents. We show that careful data selection can match full-dataset performance using only $\sim$30\% of training tasks, and that task ordering measurably affects learning outcomes. Our results position curriculum curation as a first-class design dimension for efficient test-time agent learning and practical deployment.
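For intuition, the curation recipe the abstract describes (pick a compact subset of tasks, then order them) can be sketched in a few lines. This is a minimal illustration only: the `Task` fields, the novelty/difficulty scores, and the easy-to-hard ordering rule are assumptions for exposition, not the paper's actual selection criterion.

```python
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    difficulty: float  # hypothetical score in [0, 1]
    novelty: float     # hypothetical redundancy-penalized score in [0, 1]

def curate_curriculum(tasks: list[Task], budget_frac: float = 0.3) -> list[Task]:
    """Select ~budget_frac of tasks and order them easy-to-hard.

    Sketch only: scoring and ordering rules are illustrative assumptions.
    """
    # Drop redundant or overly simple tasks: keep the most novel ones.
    ranked = sorted(tasks, key=lambda t: t.novelty, reverse=True)
    budget = max(1, int(len(tasks) * budget_frac))
    selected = ranked[:budget]
    # Order the curated subset from easy to hard (one plausible curriculum).
    return sorted(selected, key=lambda t: t.difficulty)

if __name__ == "__main__":
    demo = [Task(f"t{i}", difficulty=i / 10, novelty=(i * 7 % 10) / 10)
            for i in range(10)]
    for task in curate_curriculum(demo):
        print(task.task_id, task.difficulty, task.novelty)
```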
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 58