Keywords: LLM Agents, Large Language Models, Optimization, Tool Calls, Caching, Planning
Abstract: The rapid advancement of large language models (LLMs) is driving the emergence of LLM agents. Unlike standalone LLMs, these agents interact dynamically with their environment, employing tools, multi-step processes, and even multiple LLMs to enhance functionality. Optimizing tool usage is therefore critical for LLM agents. In this paper, we introduce ToolCacheAgent, an adaptive “agent-for-agents” that automatically caches tool-call results to improve response time and eliminate redundant computation. For each tool in the agent workflow, ToolCacheAgent generates a caching plan that specifies cacheability, expiration, and inter-tool invalidation rules to preserve correctness in stateful executions. It continuously monitors runtime signals and adapts its cache policies to shifting workloads and memory pressure. We evaluate ToolCacheAgent across a range of agent workloads with diverse tool-usage patterns and observe up to a 1.69$\times$ latency speedup without compromising accuracy.
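To make the caching-plan idea concrete, below is a minimal Python sketch of one possible realization. The schema and all names here (`CachePlan`, `ToolCache`, `ttl_seconds`, `invalidated_by`) are illustrative assumptions, not the paper's actual interface: a per-tool plan records cacheability, an expiration (TTL), and which other tools' calls should evict its entries.

```python
import hashlib
import json
import time
from dataclasses import dataclass


# Hypothetical per-tool caching plan, mirroring the three knobs named in the
# abstract: cacheability, expiration, and inter-tool invalidation rules.
@dataclass(frozen=True)
class CachePlan:
    cacheable: bool                           # may results of this tool be cached?
    ttl_seconds: float                        # expiration for cached entries
    invalidated_by: frozenset = frozenset()   # tools whose calls evict this tool's entries


class ToolCache:
    """Illustrative cache that applies per-tool CachePlans to tool calls."""

    def __init__(self, plans):
        self.plans = plans    # tool name -> CachePlan
        self.entries = {}     # (tool, args digest) -> (result, stored_at)

    @staticmethod
    def _key(tool, kwargs):
        # Identical tool + arguments map to the same cache key.
        digest = hashlib.sha256(json.dumps(kwargs, sort_keys=True).encode()).hexdigest()
        return (tool, digest)

    def call(self, tool, fn, **kwargs):
        # Inter-tool invalidation: calling `tool` evicts entries of any tool
        # whose plan lists `tool` in invalidated_by (stateful correctness).
        for entry_key in list(self.entries):
            cached_tool = entry_key[0]
            if tool in self.plans[cached_tool].invalidated_by:
                del self.entries[entry_key]

        plan = self.plans[tool]
        if not plan.cacheable:
            return fn(**kwargs)

        key = self._key(tool, kwargs)
        if key in self.entries:
            result, stored_at = self.entries[key]
            if time.time() - stored_at < plan.ttl_seconds:
                return result          # fresh hit: skip the tool call entirely
            del self.entries[key]      # expired entry

        result = fn(**kwargs)
        self.entries[key] = (result, time.time())
        return result


if __name__ == "__main__":
    plans = {
        "web_search": CachePlan(cacheable=True, ttl_seconds=300.0),
        "write_file": CachePlan(cacheable=False, ttl_seconds=0.0),
        "read_file": CachePlan(cacheable=True, ttl_seconds=60.0,
                               invalidated_by=frozenset({"write_file"})),
    }
    cache = ToolCache(plans)
    # First call executes the tool; the second identical call is a cache hit.
    cache.call("read_file", lambda path: f"contents of {path}", path="notes.txt")
    cache.call("read_file", lambda path: f"contents of {path}", path="notes.txt")
    # A write evicts read_file entries, per its inter-tool invalidation rule.
    cache.call("write_file", lambda path, data: None, path="notes.txt", data="x")
```

Keying entries on a hash of the serialized arguments lets repeated identical tool calls hit the cache, while the `invalidated_by` rule evicts stale reads after a write, the correctness concern the abstract raises for stateful executions.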
Primary Area: infrastructure, software libraries, hardware, systems, etc.
Submission Number: 8974