Towards Budget-Aware Agents: Do Agents Know What They Will Spend?

ACL ARR 2026 May Submission15737 Authors

26 May 2026 (modified: 02 Jun 2026), ACL ARR 2026 May Submission, CC BY 4.0

Keywords: budget-aware agents, deployment, social aspect, llm agents, multi-turn RL, agentic AI

Abstract: Foundation-model agents operate under growing resource constraints yet rarely know how much budget they will spend. We call this capability budget awareness and formalize it as progressive interval estimation: mid-execution, can the agent provide a calibrated interval over its remaining budget and declare when the task is infeasible? We score this with a rollout-replay protocol that re-queries the agent on every trajectory prefix, decomposing estimation into feasibility prediction, early failure detection, and interval calibration. We evaluate five frontier models across four environments, covering internal token budgets (Sokoban, Search-R1, SWE-bench) and external multi-dimensional budgets (Warehouse), and train Qwen-7B estimators with SFT and RL. We find that budget awareness (1) decouples from task performance; (2) fails in structured ways (universal optimistic bias, late failure recognition, and calibration-bound feasibility vs. reasoning-bound intervals); and (3) is actionable via early stopping and trainable via SFT-then-RL, offering a control signal that resource-limited agents currently lack.
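
The rollout-replay protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Estimate` record, the `Estimator` signature, the `replay` function, and the three reported metrics are hypothetical names chosen for illustration, and the sketch assumes that budget is a single scalar cost per step and that a finished trajectory is available to replay.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass
class Estimate:
    low: float       # lower bound on remaining budget
    high: float      # upper bound on remaining budget
    feasible: bool   # estimator's belief that the task can still succeed


# Hypothetical estimator interface: maps a trajectory prefix
# (the per-step costs observed so far) to an Estimate.
Estimator = Callable[[Sequence[float]], Estimate]


def replay(step_costs: Sequence[float], succeeded: bool,
           estimator: Estimator) -> Dict[str, Optional[float]]:
    """Re-query the estimator on every prefix of one finished trajectory."""
    n = len(step_costs)
    if n == 0:
        raise ValueError("trajectory must contain at least one step")

    total = sum(step_costs)
    spent = 0.0
    covered = 0          # prefixes whose interval contains the true remainder
    feas_correct = 0     # prefixes whose feasibility call matches the outcome
    first_flag = None    # earliest step at which infeasibility was declared

    for t in range(n):
        est = estimator(step_costs[:t])   # prefix before step t is taken
        remaining = total - spent
        covered += int(est.low <= remaining <= est.high)
        feas_correct += int(est.feasible == succeeded)
        if not est.feasible and first_flag is None:
            first_flag = t
        spent += step_costs[t]

    return {
        "coverage": covered / n,
        "feasibility_acc": feas_correct / n,
        "first_failure_flag": first_flag,
    }


# Toy estimator (an assumption for the demo): always predicts the same
# interval and always believes the task is feasible.
def constant_estimator(prefix: Sequence[float]) -> Estimate:
    return Estimate(low=50.0, high=100.0, feasible=True)


result = replay([10.0, 20.0, 30.0, 40.0], succeeded=True,
                estimator=constant_estimator)
print(result)
```

In this toy run the true remaining budget at the four prefixes is 100, 90, 70, and 40, so the fixed interval [50, 100] covers three of four prefixes. A real evaluation would replace `constant_estimator` with a call to the agent under test and aggregate the metrics over many trajectories.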

Paper Type: Long

Research Area: Information Retrieval and Text Mining

Research Area Keywords: calibration/uncertainty, probing, robustness, agent evaluation, environment interaction, LLM efficiency, tool use, planning in agents

Contribution Types: Model analysis & interpretability, NLP engineering experiment

Languages Studied: English

EMNLP 2026 AI Reviewing Experiment: yes

Submission Number: 15737
