Keywords: Context Engineering, Self-Improving LLMs
Abstract: Large language model (LLM) applications such as agents and domain-specific reasoning increasingly rely on context adaptation: modifying model inputs with instructions, strategies, or evidence rather than updating model weights.
While prior methods improve usability, they often suffer from brevity bias (discarding domain-specific insights in favor of short summaries) and from context collapse (iterative rewriting that erodes details over time).
Building on the adaptive memory introduced by Dynamic Cheatsheet, we present ACE (**A**gentic **C**ontext **E**ngineering), a framework that treats contexts as evolving playbooks that accumulate, refine, and organize strategies through a modular process of generation, reflection, and curation.
ACE prevents collapse by applying structured, incremental updates that preserve detailed knowledge and scale with long-context models.
Across agentic and domain-specific benchmarks, ACE consistently outperforms strong baselines, improving application performance by 9.0% while reducing adaptation latency and rollout cost.
Notably, ACE can adapt effectively without labeled supervision, instead leveraging natural execution feedback; on the AppWorld leaderboard, it matches the top-ranked production-level agent while using a smaller open-source model.
These results demonstrate that comprehensive, evolving contexts enable scalable, efficient, and self-improving LLM systems.
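To make the generation, reflection, and curation loop described above concrete, here is a minimal sketch of a playbook updated through structured, incremental deltas. Everything in it, including the `Playbook` class, the `ace_step` function, and the caller-supplied `llm` callable, is a hypothetical illustration under assumed names, not code from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Playbook:
    """Context as an append-and-refine store of strategy bullets."""
    entries: dict[int, str] = field(default_factory=dict)
    next_id: int = 0

    def apply_delta(self, adds: list[str], edits: dict[int, str]) -> None:
        # Structured, incremental update: new bullets are appended and
        # existing ones refined in place, rather than rewriting the whole
        # context (the failure mode the abstract calls context collapse).
        for text in adds:
            self.entries[self.next_id] = text
            self.next_id += 1
        for key, text in edits.items():
            if key in self.entries:
                self.entries[key] = text

    def render(self) -> str:
        return "\n".join(f"[{k}] {v}" for k, v in sorted(self.entries.items()))


def ace_step(llm, playbook: Playbook, task: str) -> str:
    """One adaptation step: generate a trajectory, reflect on it, curate a delta."""
    trajectory = llm(
        f"Playbook:\n{playbook.render()}\n\nTask: {task}\nSolve step by step."
    )
    lessons = llm(
        f"Trajectory:\n{trajectory}\n\nList reusable lessons, one per line."
    )
    adds = [line.strip() for line in lessons.splitlines() if line.strip()]
    playbook.apply_delta(adds=adds, edits={})  # curator merges lessons as a delta
    return trajectory


if __name__ == "__main__":
    # Stub "model": execution feedback alone drives adaptation, no labels needed.
    stub = lambda prompt: "Always validate API arguments before calling."
    pb = Playbook()
    ace_step(stub, pb, "Book a flight via the travel API.")
    print(pb.render())  # -> [0] Always validate API arguments before calling.
```

The design choice mirrored here is that updates are deltas over an itemized context, so accumulated strategies are never lost to a full rewrite.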
Primary Area: foundation or frontier models, including LLMs
Submission Number: 16475