PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases

Sri Vatsa Vuddanti; Aarav Shah; Satwik Kumar Chittiprolu; Tony Song; Sunishchal Dev; Kevin Zhu; Maheep Chaudhary

Published: 10 Jan 2026, Last Modified: 11 Jan 2026. Venue: LaMAS 2026 Oral. License: CC BY 4.0

Keywords: Tool-augmented agents, failure recovery, robust reasoning, error handling, fine-tuning, case-based retrieval, fault tolerance, API failures, agent reliability, ToolBench

TL;DR: PALADIN trains agents to recover from tool failures using 50K+ scenarios and 55+ recovery examples, achieving an 89.68% recovery rate (vs. 23.75% to 76.34% for baselines) and 95.2% generalization to unseen failures.

Abstract: Tool-augmented language agents routinely fail in deployment due to execution-time tool errors such as timeouts, malformed outputs, or silent API failures. In agentic and multi-agent systems, these failures are especially damaging: a single unhandled error can cascade across reasoning steps or agents, leading to deadlock or hallucinated success. Despite this, most training pipelines optimize only for clean, successful trajectories and leave execution-level recovery largely unmodeled. We propose PALADIN, a framework for teaching language agents explicit, generalizable recovery behavior under tool failures. PALADIN trains agents on over 50,000 recovery-annotated trajectories generated via systematic failure injection aligned with the ToolScan taxonomy, while preserving base task competence through LoRA-based fine-tuning. At inference time, agents detect execution failures and condition their responses on a small, curated bank of taxonomy-aligned recovery exemplars, enabling structured diagnosis and repair rather than reactive retries. Across multiple backbones and evaluation settings, PALADIN consistently improves execution-level robustness. On deployment-relevant benchmarks, it raises Recovery Rate from 32.8% to 89.7%, reduces catastrophic (hallucinated) success, and substantially increases task completion, while incurring only modest efficiency overhead. Crucially, PALADIN generalizes to unseen tools and failure types, retaining over 95% recovery rate on out-of-distribution APIs. These results demonstrate that execution-level recovery is a learnable and transferable capability. By treating tool failure as a first-class training signal, PALADIN provides a practical foundation for building reliable, failure-aware language agents and offers a pathway toward safer and more robust LLM-based agentic and multi-agent systems.
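
The inference-time idea in the abstract (detect an execution failure, then condition the agent on a matching recovery exemplar) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the `ToolResult` type, the three failure labels (taken from the failure kinds named in the abstract, not from the ToolScan taxonomy itself), the exemplar texts, and the prompt layout are all assumptions made for this example.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    """Hypothetical container for what a tool runner hands back to the agent."""
    status: str            # "ok" or "timeout", as reported by the assumed runner
    body: Optional[str]    # raw response text; may be None or empty


# Hypothetical exemplar bank keyed by failure label. The texts are
# placeholders written for this sketch, not exemplars from the paper.
EXEMPLAR_BANK = {
    "timeout": "Retry once with a smaller request, then switch to an alternative tool.",
    "malformed_output": "Do not parse partial data; re-issue the call and validate the schema.",
    "silent_failure": "Treat an empty response as a failure; verify state before continuing.",
}


def detect_failure(result: ToolResult) -> Optional[str]:
    """Return a failure label, or None if the result looks usable.

    Assumes tools are expected to return JSON; that is a simplification.
    """
    if result.status == "timeout":
        return "timeout"
    if result.body is None or result.body.strip() == "":
        return "silent_failure"
    try:
        json.loads(result.body)
    except json.JSONDecodeError:
        return "malformed_output"
    return None


def build_recovery_prompt(task: str, result: ToolResult) -> Optional[str]:
    """Build a prompt that conditions the agent on the matching exemplar."""
    label = detect_failure(result)
    if label is None:
        return None  # no failure detected, so no recovery prompt is needed
    return (
        f"Task: {task}\n"
        f"Observed failure: {label}\n"
        f"Recovery exemplar: {EXEMPLAR_BANK[label]}\n"
        "Diagnose the failure and propose the next action."
    )
```

For instance, `build_recovery_prompt("book a flight", ToolResult("timeout", None))` would produce a prompt containing the timeout exemplar, whereas a well-formed JSON response yields `None` and the agent proceeds as usual. A fixed dictionary lookup is used here purely for brevity; the keywords mention case-based retrieval, which a real system would presumably implement differently.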

Submission Number: 22
