Abstract: The operational efficacy of large language models relies heavily on their inference-time context. This has established Context Engineering (CE) as a formal discipline for optimizing these inputs. Current CE methods rely on manually crafted harnesses, such as rigid generation-reflection workflows and predefined context schemas. These harnesses impose structural biases and restrict context optimization to a narrow, intuition-bound design space. To address this, we introduce Meta Context Engineering (MCE), a bi-level framework that supersedes static CE heuristics by co-evolving CE skills and context artifacts. In each MCE iteration, a meta-level agent refines engineering skills via agentic crossover, a deliberative search over the history of skills, their executions, and evaluations. A base-level agent then executes these skills, learns from training rollouts, and optimizes context as flexible files and code. We evaluate MCE across five disparate domains under offline and online settings. MCE demonstrates consistent performance gains, achieving 5.6% to 53.8% relative improvement over state-of-the-art agentic CE methods (mean of 16.9%), while maintaining superior context adaptability, transferability, and efficiency in both context usage and training. Code is available at https://github.com/henry-yeh/mce.
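The bi-level loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`run_bilevel`, `Record`, `toy_meta_step`, `toy_base_step`, `toy_evaluate`, `POOL`, `TARGET`) is hypothetical, the "skill" is reduced to a tuple of section names, and the LLM-driven agentic crossover is replaced by a deterministic merge of the two best-scoring skills. It only shows the control flow: a meta-level step proposes a skill from the full history, a base-level step turns the skill into a context artifact, and the evaluation is appended to the history.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Skill = Tuple[str, ...]  # toy stand-in for a context-engineering skill


@dataclass
class Record:
    skill: Skill   # skill proposed at this iteration
    context: str   # context artifact produced by executing the skill
    score: int     # evaluation of that context


def run_bilevel(
    meta_step: Callable[[List[Record]], Skill],
    base_step: Callable[[Skill], str],
    evaluate: Callable[[str], int],
    init_skill: Skill,
    iterations: int,
) -> List[Record]:
    """Alternate meta-level skill refinement and base-level execution."""
    history: List[Record] = []
    skill = init_skill
    for _ in range(iterations):
        if history:
            # Meta level: propose a new skill from past skills and scores.
            skill = meta_step(history)
        # Base level: execute the skill to build a context artifact.
        context = base_step(skill)
        history.append(Record(skill, context, evaluate(context)))
    return history


# Toy assumptions: three candidate sections, two of which help the task.
POOL = ("examples", "rules", "code")
TARGET = {"rules", "code"}


def toy_meta_step(history: List[Record]) -> Skill:
    """Merge the two best skills, then add one unused section from POOL."""
    top = sorted(history, key=lambda r: r.score, reverse=True)[:2]
    merged: List[str] = []
    for rec in top:
        for section in rec.skill:
            if section not in merged:
                merged.append(section)
    for candidate in POOL:
        if candidate not in merged:
            merged.append(candidate)
            break
    return tuple(merged)


def toy_base_step(skill: Skill) -> str:
    """Render one placeholder line per section named by the skill."""
    return "\n".join(f"[{section}]" for section in skill)


def toy_evaluate(context: str) -> int:
    """Reward useful sections and charge one point per context line."""
    lines = context.splitlines()
    hits = sum(1 for line in lines if line.strip("[]") in TARGET)
    return 10 * hits - len(lines)


history = run_bilevel(toy_meta_step, toy_base_step, toy_evaluate, ("examples",), 4)
for record in history:
    print(record.skill, record.score)
```

In this toy, the score rises as the merged skill picks up the useful sections and then plateaus once nothing new is left to add. In the framework the abstract describes, both steps would be carried out by agents, and the context would be files and code rather than a list of section names.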
Lay Summary: Many AI systems perform much better when they are given the right background information, examples, rules, or instructions before answering a question. Today, people often design this extra information by hand, using fixed templates or workflows. This can work, but it also limits what the AI system can learn, because the format and update process are decided in advance. This paper introduces Meta Context Engineering, a method that lets an AI agent improve both the information it uses and the way that information is created. One agent learns better strategies for building useful context, while another agent applies those strategies to organize examples, rules, files, and code for a specific task. We test this approach on tasks from finance, chemistry, medicine, law, and AI safety. Across these settings, the method improves performance over prior context-engineering approaches, while often using context more efficiently and requiring less training work. The broader goal is to make AI systems better at adapting to specialized domains without needing to retrain the underlying model.
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/henry-yeh/mce
Primary Area: Deep Learning->Large Language Models
Keywords: Large language model, context engineering, agent skills, evolutionary computation
Originally Submitted PDF: pdf
Submission Number: 21111