Keywords: tool use; function calling; self-correction; reinforcement learning
Abstract: Large language models equipped with external tools have shown strong potential as general-purpose agents, yet they remain brittle when tool execution fails. Existing approaches largely rely on heuristic self-correction or imitation of error-fix patterns, making it difficult for agents to reliably recover from tool-calling errors, especially in multi-step settings. In this paper, we present TALE, a reinforcement learning framework that enables LLM-based agents to acquire robust self-correction behaviors by explicitly learning from their own tool-calling errors. TALE formulates error recovery as a sequential decision-making problem and introduces a progress-aware reward that captures incremental improvement across consecutive repair attempts, rather than relying solely on sparse success signals. This design encourages directional exploration and discourages repetitive or unproductive behaviors. Extensive experiments on four representative tool-use benchmarks demonstrate that TALE consistently improves task success and correction efficiency, achieving significant gains over strong baselines in challenging settings such as BFCL-v3. Our results show that learning from error experience is a key step toward more reliable and adaptive tool-using agents.
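The abstract's progress-aware reward could take many forms; the paper's exact formulation is not given here. A minimal illustrative sketch, assuming a scalar progress score per repair attempt (the function name, arguments, and default weights below are all hypothetical):

```python
# Hypothetical sketch of a progress-aware repair reward, NOT TALE's actual
# formulation (the abstract does not specify one). The idea: reward the
# change in a progress score between consecutive repair attempts, keep a
# sparse success bonus, and penalize repeating an identical failing call.

def progress_reward(prev_score: float, curr_score: float,
                    succeeded: bool, repeated_call: bool,
                    shaping_weight: float = 1.0,
                    success_bonus: float = 1.0,
                    repeat_penalty: float = 0.5) -> float:
    """Dense reward for one repair attempt.

    prev_score / curr_score: task progress in [0, 1] before/after the attempt
    succeeded: whether the tool call now executes correctly
    repeated_call: whether the agent re-issued an identical failing call
    """
    # Incremental improvement term: positive when the attempt moves the
    # agent closer to a working call, negative when it regresses.
    reward = shaping_weight * (curr_score - prev_score)
    if succeeded:
        reward += success_bonus    # sparse terminal signal is still present
    if repeated_call:
        reward -= repeat_penalty   # discourage unproductive repetition
    return reward
```

Under this shaping, an attempt that improves progress earns a positive reward even before the task succeeds, which is one way to realize the abstract's "directional exploration" without relying solely on sparse success signals.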
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: tool use; function calling; reinforcement learning in agents
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 7588