Modeling Tool Use in Transformers via Computation Oracles

Published: 02 Mar 2026, Last Modified: 02 Mar 2026 · LIT Workshop @ ICLR 2026 · CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: Tools, Agents, Function Calls, Expressivity, Oracles, Computation Theory, Complexity Theory
TL;DR: We propose a new formal framework for analysing the popular setting of giving agents access to tools/function calls, draw equivalences with test-time depth-scaling techniques, and prove additional results.
Abstract: Prior literature has mapped the transformer architecture onto classical models of computation, especially via circuit complexity, and has analyzed how expressive power is enhanced by adding computational resources such as Chain-of-Thought, padding tokens, and depth scaling. Adding tools or function calls to transformer models has shown impressive empirical gains, but their theoretical understanding remains under-studied. In this paper, we analyze function calls as $oracles$ in the sense of classical complexity theory and provide a formal framework for analyzing the expressive gains of augmenting transformer models with function calls and agentic cooperation. While previous work has analyzed multi-agent systems as graph problems and interactive proof systems, we formalize tool use as oracle access, yielding a more general, model-agnostic framework that subsumes particular interaction patterns and enables explicit complexity-theoretic characterizations, equivalence theorems, and conditional separations on expressive power. We show that the tool-use setting is strictly more powerful than $\log$-depth transformers under similar resource bounds. Finally, we show that access to threshold decision oracles suffices to compute the associated optimization objectives via binary search using only $O(\log B(n))$ adaptive queries, with all between-query control logic implementable in a constant-depth softmax transformer. Together, these results provide a complexity-theoretic framework for understanding tool use as an alternative mechanism for scaling test-time compute and recovering expressivity gains analogous to dynamic depth, while clarifying when tool access can (and cannot) exceed the power of $\log$-depth transformers.
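The abstract's final result, reducing an optimization objective to $O(\log B(n))$ adaptive calls to a threshold decision oracle, follows the standard decision-to-optimization binary search. A minimal sketch of that query pattern (the names `threshold_oracle` and `B` are hypothetical stand-ins, not from the paper):

```python
def max_value_via_oracle(threshold_oracle, B):
    """Recover the optimum value in [0, B] from a monotone threshold
    oracle ("is the objective >= v?") using O(log B) adaptive queries.
    Assumes the oracle answers True up to the optimum, then False."""
    lo, hi = 0, B
    while lo < hi:
        mid = (lo + hi + 1) // 2      # bias upward so the loop terminates
        if threshold_oracle(mid):     # one adaptive oracle query
            lo = mid                  # optimum is at least mid
        else:
            hi = mid - 1              # optimum is below mid
    return lo

# Example with a hidden optimum of 42 and budget B = 100:
oracle = lambda v: v <= 42
print(max_value_via_oracle(oracle, 100))  # -> 42
```

The between-query control logic here is just a comparison and an interval update, which is the part the paper argues is implementable in a constant-depth softmax transformer.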
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Utkarsh_Tiwari1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 74