Improving Language Agents through BREW: Bootstrapping expeRientially-learned Environmental knoWledge
Keywords: Language agents, agent memory, computer use agents
Abstract: Large Language Model (LLM)-based agents are increasingly applied to tasks requiring structured reasoning, tool use, and environmental adaptation, such as data manipulation, multistep planning, and computer-use automation. Despite this versatility, current training paradigms based on model weight optimization, such as PPO and GRPO, remain relatively impractical because of the high computational overhead of the rollouts needed for convergence. In addition, the resulting agent policies are difficult to interpret, adapt, or incrementally improve. To address this, we investigate creating and refining a structured memory of what an agent learns experientially from its environment as an alternative route to agent optimization. We introduce \textbf{BREW} (Bootstrapping expeRientially-learned Environmental knoWledge), a framework that optimizes agents for downstream tasks via knowledge base (KB) construction and refinement. In our formulation, we introduce an effective method for partitioning agent memory for more efficient retrieval and refinement. BREW uses task graders and behavior rubrics to learn insights, while leveraging state-space search to ensure robustness to the noise and non-specificity of natural language. Empirical results on real-world, domain-grounded benchmarks (OSWorld and $\tau^2$Bench) show that BREW achieves a 10--20\% improvement in task precision and a 10--15\% reduction in API/tool calls, leading to faster execution time, all while maintaining computational efficiency on par with base models. Unlike prior work where memory is treated as static context, we establish the KB as a modular and controllable substrate for agent optimization: an explicit lever for shaping behavior in a transparent, interpretable, and extensible manner.
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 25161