Keywords: agentic systems, tool shortlisting, tabular models, textual–tabular classifier, execution traces, schema/state/dependency features, synthetic supervision, routing/gating, efficiency
TL;DR: A single-pass textual–tabular head (TabAgent) replaces generative decision modules in agents, preserving tool-shortlisting quality while cutting latency by ~95% and cost by 85–91%.
Abstract: Agentic systems often implement routing, shortlisting, gating, and verification with repeated frontier-LLM calls, so token and latency costs accumulate over a run. We introduce TabAgent, a framework that reframes such closed-set decision heads as textual–tabular classification trained on signals extracted from execution traces. TabAgent comprises (i) TabSchema, which distills schema, state, and dependency features from trajectories; (ii) TabSynth, which adds schema-aligned synthetic supervision to improve coverage of rare but decision-critical patterns; and (iii) TabHead, a compact classifier that outputs calibrated probabilities for each candidate in a single forward pass. Evaluated as a drop-in replacement for the GPT-based shortlister in IBM CUGA on AppWorld, TabAgent maintains shortlist quality (e.g., Recall@7 ≥ 0.88 and Recall@9 ≥ 0.92 across five applications) and, with TabSynth, improves macro P@R by +0.14 on average. Critically, TabAgent eliminates shortlist-time LLM calls, achieving a ~95% latency reduction and an 85–91% cost reduction relative to CUGA’s GPT-4.1 shortlister in our setup, while also generalizing to other heads such as application selection and task-complexity gating. These results suggest that execution traces expose sufficiently rich, tabular-representable signals to replace generative components with efficient discriminative heads in production agentic architectures.
Primary Area: applications to robotics, autonomy, planning
Submission Number: 11889
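The abstract reports shortlist quality as Recall@k over per-candidate probabilities. As a rough illustration only, the sketch below shows how a top-k shortlist and its Recall@k could be computed, assuming the standard definition (the fraction of relevant tools that appear among the k highest-scoring candidates). The function names, tool names, and probabilities are hypothetical and are not taken from the paper or from CUGA.

```python
from typing import Dict, List, Set


def shortlist(probs: Dict[str, float], k: int) -> List[str]:
    """Return the k candidates with the highest predicted probability."""
    # Sort by descending probability; break ties by name for determinism.
    ranked = sorted(probs, key=lambda name: (-probs[name], name))
    return ranked[:k]


def recall_at_k(probs: Dict[str, float], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant tools that appear in the top-k shortlist."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    top_k = set(shortlist(probs, k))
    return len(top_k & relevant) / len(relevant)


# Hypothetical per-tool probabilities, standing in for the output of a
# single-pass classifier head; the values are made up for illustration.
example_probs = {"tool_a": 0.9, "tool_b": 0.1, "tool_c": 0.7, "tool_d": 0.4}
# Hypothetical ground-truth tools needed for the task.
example_relevant = {"tool_a", "tool_d"}

for k in (2, 3):
    print(k, shortlist(example_probs, k), recall_at_k(example_probs, example_relevant, k))
```

With these made-up scores, a shortlist of size 2 misses one of the two relevant tools, while size 3 recovers both; the trade-off between shortlist size and recall is the quantity the Recall@7 and Recall@9 figures summarize.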