PolicyBank: Evolving Policy Understanding For Evolving Agents

Published: 02 Mar 2026, Last Modified: 05 Mar 2026, Venue: LLA 2026 Poster, Readers: Everyone, License: CC BY 4.0
Keywords: evolving agents, evolving policy understanding, lifelong learning, policy alignment
TL;DR: The first memory mechanism for dynamic policy alignment for evolving agents under policy constraints
Abstract: Large language model agents in production environments must not only complete tasks successfully but also comply with policies, such as corporate rules or regulatory constraints, that govern which actions are allowed. A fundamental challenge is that these policies are typically written in natural language by domain users and often lack precise specifications at the tool-call level. Such policy specification gaps cause systematic failures regardless of the agent's task capability. This motivates us to study the problem of evolving policy understanding. We propose PolicyBank, a memory mechanism that maintains permissible tool-calling preconditions and trajectories by reasoning over past experiences and feedback to refine its policy interpretations. For systematic evaluation, we extend tool-calling benchmarks to test whether agents can automatically identify policy gaps and dynamically adapt over continuous task streams. Our evaluations demonstrate that PolicyBank significantly outperforms currently popular memory mechanisms on policy-intensive tasks. This work establishes dynamic adaptation of policy understanding as a critical capability for lifelong learning agents and provides a methodology for measuring progress toward agents that proactively refine their understanding of, and align with, underspecified constraints.
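The abstract gives no implementation details, so the following is only a minimal illustrative sketch of what a memory of tool-calling preconditions and permitted trajectories might look like. Every name here (`PolicyMemory`, `record_violation`, `record_success`, the `issue_refund` tool, and the refund rule itself) is a hypothetical assumption made for illustration and is not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Args = Dict[str, object]
Precondition = Callable[[Args], bool]


@dataclass
class PolicyMemory:
    """Hypothetical store of tool-call preconditions and permitted trajectories."""
    preconditions: Dict[str, List[Precondition]] = field(default_factory=dict)
    permitted_trajectories: List[List[str]] = field(default_factory=list)

    def is_permitted(self, tool: str, args: Args) -> bool:
        # A call is allowed only if every stored precondition for the tool holds.
        return all(check(args) for check in self.preconditions.get(tool, []))

    def record_violation(self, tool: str, refined: Precondition) -> None:
        # Feedback flagged a violation: store a refined precondition for this tool.
        self.preconditions.setdefault(tool, []).append(refined)

    def record_success(self, trajectory: List[str]) -> None:
        # Keep a compliant tool-call sequence so it can be reused later.
        if trajectory not in self.permitted_trajectories:
            self.permitted_trajectories.append(list(trajectory))


memory = PolicyMemory()

# Before any feedback, the underspecified policy places no limit on refunds.
before = memory.is_permitted("issue_refund", {"amount": 500})

# Assumed feedback (illustrative only): refunds above 100 need manager approval.
memory.record_violation(
    "issue_refund",
    lambda a: a.get("amount", 0) <= 100 or a.get("manager_approved") is True,
)

after = memory.is_permitted("issue_refund", {"amount": 500})
approved = memory.is_permitted(
    "issue_refund", {"amount": 500, "manager_approved": True}
)
memory.record_success(["lookup_order", "issue_refund"])
print(before, after, approved)
```

In this sketch the same call is permitted before feedback and rejected afterwards, which mirrors the idea of closing a policy specification gap at the tool-call level. How PolicyBank actually derives and represents its preconditions is not stated in the abstract.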
Submission Number: 146