Improving Language Agents through BREW: Bootstrapping expeRientially-learned Environmental knoWledge


ICLR 2026 Conference Submission25161 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · Readers: Everyone · CC BY 4.0

Keywords: Language agents, agent memory, computer use agents

Abstract: Large Language Model (LLM)-based agents are increasingly applied to tasks requiring structured reasoning, tool use, and environmental adaptation, such as data manipulation, multistep planning, and computer-use automation. However, despite the versatility of these agents, weight-optimization training methods such as PPO and GRPO remain relatively impractical because of the high computational overhead of the rollouts needed for convergence. In addition, the resulting agent policies are difficult to interpret, adapt, or incrementally improve. To address this, we investigate building and refining a structured memory of what an agent learns from experience in its environment, as an alternative route to agent optimization. We introduce \textbf{BREW} (Bootstrapping expeRientially-learned Environmental knoWledge), a framework that optimizes agents for downstream tasks by constructing and refining a knowledge base (KB). Our formulation includes a method for partitioning agent memory that makes retrieval and refinement more efficient. BREW uses task graders and behavior rubrics to learn insights, and leverages state-space search to ensure robustness to the noise and non-specificity of natural language. Empirical results on the real-world, domain-grounded benchmarks OSWorld and $\tau^2$Bench show that BREW achieves a 10--20\% improvement in task precision and a 10--15\% reduction in API/tool calls, leading to faster execution time, all while maintaining computational efficiency on par with base models. Unlike prior work where memory is treated as static context, we establish the KB as a modular and controllable substrate for agent optimization: an explicit lever for shaping behavior in a transparent, interpretable, and extensible manner.

Supplementary Material: pdf

Primary Area: foundation or frontier models, including LLMs

Submission Number: 25161
