TabAgent: A Framework for Replacing Agentic Generative Components with Tabular-Textual Classifiers

Eilam Shapira; Ido Levy; Yinon Goldshtein; Avi Yaeli; Nir Mashkif; Segev Shlomov

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0

Keywords: agentic systems, tool shortlisting, tabular models, textual–tabular classifier, execution traces, schema/state/dependency features, synthetic supervision, routing/gating, efficiency

TL;DR: Single-pass textual–tabular head (TabAgent) replaces generative decision modules in agents, preserving tool shortlisting task quality while cutting latency by ~95% and cost by 85–91%.

Abstract: Agentic systems often implement routing, shortlisting, gating, and verification with repeated frontier-LLM calls, so token and latency costs accumulate over a run. We introduce TabAgent, a framework that reframes such closed-set decision heads as textual–tabular classification trained on signals extracted from execution traces. TabAgent comprises (i) TabSchema, which distills schema, state, and dependency features from trajectories; (ii) TabSynth, which adds schema-aligned synthetic supervision to improve coverage of rare but decision-critical patterns; and (iii) TabHead, a compact classifier that outputs calibrated probabilities for each candidate in a single forward pass. Evaluated as a drop-in replacement for the GPT-based shortlister in IBM CUGA on AppWorld, TabAgent maintains shortlist quality (e.g., Recall@7 ≥ 0.88 and Recall@9 ≥ 0.92 across five applications) and, with TabSynth, improves macro P@R by +0.14 on average. Critically, TabAgent eliminates shortlist-time LLM calls, achieving a ~95% latency reduction and an 85–91% cost reduction relative to CUGA’s GPT-4.1 shortlister in our setup. It also generalizes to other heads, such as application selection and task-complexity gating. These results suggest that execution traces expose sufficiently rich, tabular-representable signals to replace generative components with efficient discriminative heads in production agentic architectures.
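To make the shortlisting metric concrete, the sketch below shows how per-candidate probabilities from a single forward pass could be turned into a top-k shortlist and scored with Recall@k. It is an illustration only, not the authors' implementation: the function names, the tool names, and the probabilities are hypothetical, and Recall@k is assumed to follow the common definition (the fraction of relevant tools that appear among the k highest-scored candidates), which the paper may define differently.

```python
from typing import Dict, List, Set


def shortlist(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k candidates with the highest scores (ties broken by name)."""
    ranked = sorted(probs.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def recall_at_k(probs: Dict[str, float], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant candidates that appear in the top-k shortlist."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    top_k = set(shortlist(probs, k))
    return len(top_k & relevant) / len(relevant)


# Hypothetical per-tool probabilities, standing in for a classifier's output.
example_probs = {
    "spotify.search_songs": 0.91,
    "spotify.play_song": 0.74,
    "spotify.login": 0.66,
    "spotify.delete_playlist": 0.08,
    "spotify.show_account": 0.05,
}

# Hypothetical ground truth: tools actually needed to complete the task.
example_relevant = {
    "spotify.login",
    "spotify.search_songs",
    "spotify.show_account",
}

if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, recall_at_k(example_probs, example_relevant, k))
```

In this toy setting a larger k trades shortlist size for recall, which is the same trade-off the Recall@7 and Recall@9 figures in the abstract describe.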

Primary Area: applications to robotics, autonomy, planning

Submission Number: 11889
