AgentVocab: Structure-Aware Vocabulary Adaptation for Efficient LLM Agents

Published: 30 Apr 2026, Last Modified: 24 Jun 2026, ICML 2026 regular, CC BY 4.0
TL;DR: We introduce AgentVocab, a structure-aware vocabulary adaptation method for LLM agents that improves decoding efficiency in tool-calling scenarios while preserving task accuracy, without modifying model architectures.
Abstract: Recent large language models (LLMs) have demonstrated strong capabilities across challenging tasks, enabling their widespread adoption in agentic systems that interact with external tools. In such deployments, however, LLMs are typically trained with general-purpose tokenizers designed for broad language coverage, while their usage is dominated by narrow, structured tool-calling interactions. This training–deployment mismatch leads to inefficient tokenization, where repetitive structural patterns and frequent semantic units in function calls are fragmented into long sequences of low-level tokens, increasing decoding overhead. To address this gap, we introduce $\textbf{AgentVocab}$, a structure-aware vocabulary adaptation framework for efficient LLM agents. AgentVocab derives specialized vocabulary entries from real tool-calling traces and adapts the model vocabulary to better reflect structural and semantic regularities, without task-specific schema engineering. Experiments on $\tau$-bench and $\tau^2$-bench show that AgentVocab preserves tool-calling performance while reducing latency relative to the vanilla baseline by 17.7\% and 19.5\%, respectively. Our approach is orthogonal to existing fine-tuning and agent-training methods and integrates seamlessly into standard agent pipelines. Source code and models will be available at https://github.com/Starry-159/AgentVocab.
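The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea it describes: mining recurring patterns from tool-calling traces and promoting them to single vocabulary entries. Everything in it is an assumption made for illustration and not AgentVocab's actual procedure: the character-level stand-in for a general-purpose tokenizer, the hypothetical helpers `mine_entries` and `adapted_tokenize`, the count-times-length scoring, and the three toy traces.

```python
from collections import Counter


def base_tokenize(text):
    """Stand-in for a general-purpose tokenizer (assumption): one token per character."""
    return list(text)


def mine_entries(traces, min_len=4, max_len=24, min_count=2, top_k=8):
    """Pick recurring substrings as candidate vocabulary entries.

    Hypothetical scoring: occurrences * (length - 1), i.e. roughly the number
    of base tokens saved if each occurrence became a single token.
    """
    counts = Counter()
    for trace in traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                counts[trace[i:i + n]] += 1
    scored = sorted(
        ((c * (len(s) - 1), s) for s, c in counts.items() if c >= min_count),
        reverse=True,
    )
    return [s for _, s in scored[:top_k]]


def adapted_tokenize(text, entries):
    """Greedy longest-match over the added entries, falling back to base tokens."""
    ordered = sorted(entries, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for entry in ordered:
            if text.startswith(entry, i):
                tokens.append(entry)
                i += len(entry)
                break
        else:
            # No added entry matches here: emit a single base token.
            tokens.append(text[i])
            i += 1
    return tokens


# Toy tool-calling traces (made up for this sketch).
traces = [
    '{"name": "get_order", "arguments": {"order_id": "A1"}}',
    '{"name": "get_order", "arguments": {"order_id": "B2"}}',
    '{"name": "cancel_order", "arguments": {"order_id": "A1"}}',
]
entries = mine_entries(traces)
before = sum(len(base_tokenize(t)) for t in traces)
after = sum(len(adapted_tokenize(t, entries)) for t in traces)
print(f"tokens before: {before}, after: {after}")
```

Fewer tokens per tool call means fewer decoding steps, which is the source of the latency reduction the abstract reports. In a real model, each added entry would also need an embedding row, which this sketch leaves out.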
Lay Summary: Modern AI assistants increasingly need to use outside tools, such as databases, search systems, or service platforms, to help people complete real tasks. These assistants often repeat the same kinds of structured messages when choosing tools, sending requests, and reading tool results. However, current language models usually break these repeated patterns into many tiny pieces of text, making each interaction longer and slower. We introduce AgentVocab, a method that teaches the model to treat common tool-use patterns as reusable text units. Instead of changing what the assistant is asked to do, AgentVocab changes how repeated tool-related text is represented, so the assistant can process and generate it more efficiently. In experiments on standard tests for tool-using AI assistants, this makes interactions about one fifth faster while keeping task performance at a similar level. This work shows that improving the way AI assistants represent repeated tool-use patterns can make them faster, cheaper to run, and more practical for real-world applications.
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/Starry-159/AgentVocab
Primary Area: Deep Learning->Large Language Models
Keywords: Large Language Models, Agent, Vocabulary Adaptation
Originally Submitted PDF: pdf
Submission Number: 25422