Keywords: LLM Agents, Large Language Models, Optimization, Tool Calls, Caching, Planning
Abstract: The rapid advancement of large language models (LLMs) is driving the emergence of LLM agents. Unlike standalone LLMs, these agents interact dynamically with their environment, employing tools, multi-step processes, and even multiple LLMs to enhance functionality. Optimizing tool usage is therefore critical for LLM agents. In this paper, we introduce ToolCacheAgent, an adaptive “agent-for-agents” that automatically caches tool-call results to improve response time and eliminate redundant computation. For each tool in the agent workflow, ToolCacheAgent generates a caching plan that specifies cacheability, expiration, and inter-tool invalidation rules to preserve correctness in stateful executions. It continuously monitors runtime signals and adapts its cache policies to shifting workloads and memory pressure. We evaluate ToolCacheAgent across a range of agent workloads with diverse tool-usage patterns and observe up to a 1.69$\times$ latency speedup without compromising accuracy.
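To make the caching-plan idea concrete, below is a minimal Python sketch of one possible realization. The schema and all names here (`CachePlan`, `ToolCache`, `ttl_seconds`, `invalidated_by`) are illustrative assumptions, not the paper's actual interface: a per-tool plan records cacheability, an expiration (TTL), and which other tools' calls should evict its entries.

```python
import hashlib
import json
import time
from dataclasses import dataclass


# Hypothetical per-tool caching plan, mirroring the three knobs named in the
# abstract: cacheability, expiration, and inter-tool invalidation rules.
@dataclass(frozen=True)
class CachePlan:
    cacheable: bool                           # may results of this tool be cached?
    ttl_seconds: float                        # expiration for cached entries
    invalidated_by: frozenset = frozenset()   # tools whose calls evict this tool's entries


class ToolCache:
    """Illustrative cache that applies per-tool CachePlans to tool calls."""

    def __init__(self, plans):
        self.plans = plans    # tool name -> CachePlan
        self.entries = {}     # (tool, args digest) -> (result, stored_at)

    @staticmethod
    def _key(tool, kwargs):
        # Identical tool + arguments map to the same cache key.
        digest = hashlib.sha256(json.dumps(kwargs, sort_keys=True).encode()).hexdigest()
        return (tool, digest)

    def call(self, tool, fn, **kwargs):
        # Inter-tool invalidation: calling `tool` evicts entries of any tool
        # whose plan lists `tool` in invalidated_by (stateful correctness).
        for entry_key in list(self.entries):
            cached_tool = entry_key[0]
            if tool in self.plans[cached_tool].invalidated_by:
                del self.entries[entry_key]

        plan = self.plans[tool]
        if not plan.cacheable:
            return fn(**kwargs)

        key = self._key(tool, kwargs)
        if key in self.entries:
            result, stored_at = self.entries[key]
            if time.time() - stored_at < plan.ttl_seconds:
                return result          # fresh hit: skip the tool call entirely
            del self.entries[key]      # expired entry

        result = fn(**kwargs)
        self.entries[key] = (result, time.time())
        return result


if __name__ == "__main__":
    plans = {
        "web_search": CachePlan(cacheable=True, ttl_seconds=300.0),
        "write_file": CachePlan(cacheable=False, ttl_seconds=0.0),
        "read_file": CachePlan(cacheable=True, ttl_seconds=60.0,
                               invalidated_by=frozenset({"write_file"})),
    }
    cache = ToolCache(plans)
    # First call executes the tool; the second identical call is a cache hit.
    cache.call("read_file", lambda path: f"contents of {path}", path="notes.txt")
    cache.call("read_file", lambda path: f"contents of {path}", path="notes.txt")
    # A write evicts read_file entries, per its inter-tool invalidation rule.
    cache.call("write_file", lambda path, data: None, path="notes.txt", data="x")
```

Keying entries on a hash of the serialized arguments lets repeated identical tool calls hit the cache, while the `invalidated_by` rule evicts stale reads after a write, the correctness concern the abstract raises for stateful executions.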
Primary Area: infrastructure, software libraries, hardware, systems, etc.
Submission Number: 8974