ToolGrad: Efficient Tool-use Dataset Generation with Textual “Gradients”

ACL ARR 2025 May Submission5218 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Prior work synthesizes tool-use LLM datasets by first generating a user query and then producing complex tool-use annotations, e.g., via depth-first search (DFS). This paradigm inherently leads to annotation failures and low data-generation efficiency. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual "gradients", and then synthesizes corresponding user queries. This "answer-first" approach yields ToolGrad-5k, a dataset with more complex tool use, lower generation cost, and a 100% pass rate. Experiments show that models trained on ToolGrad-5k outperform those trained on more expensive baseline datasets, as well as proprietary LLMs, even on OOD benchmarks.
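The "answer-first" inversion described in the abstract can be sketched as a short loop. This is a minimal illustration only, assuming hypothetical function and field names (`propose_call`, `textual_gradient`, `synthesize_query`, etc.) that are not the authors' actual API: a tool-use chain is grown step by step under textual "gradient" feedback, and a user query is synthesized only after a valid chain exists.

```python
# Hypothetical sketch of an answer-first generation loop in the spirit of
# ToolGrad. All names and stub behaviors here are illustrative assumptions,
# not the paper's actual implementation.

from dataclasses import dataclass, field

@dataclass
class ToolChain:
    calls: list = field(default_factory=list)  # executed (call, result) pairs

def propose_call(chain, gradient):
    # Stand-in for an LLM proposing the next tool call given the critique.
    return f"tool_{len(chain.calls)}(input='{gradient}')"

def execute(call):
    # Stand-in for running the tool; real execution failures would surface here,
    # so only chains whose calls actually succeed are kept.
    return f"result_of[{call}]"

def textual_gradient(chain):
    # Stand-in for an LLM critique of the current chain: a textual "gradient"
    # saying how to extend it, or that it is complete.
    return "add a step" if len(chain.calls) < 3 else "done"

def synthesize_query(chain):
    # The inversion: write a user query that the finished, valid chain answers,
    # instead of annotating tools for a pre-written query.
    return f"User query answerable by {len(chain.calls)} chained tool calls"

def toolgrad_sample(max_steps=5):
    chain = ToolChain()
    for _ in range(max_steps):
        grad = textual_gradient(chain)
        if grad == "done":
            break
        call = propose_call(chain, grad)
        chain.calls.append((call, execute(call)))  # chain is valid by construction
    return {"query": synthesize_query(chain), "chain": chain.calls}

sample = toolgrad_sample()
```

Because every appended call is executed and critiqued before the query is written, a sample is emitted only once its chain is already valid, which is how the answer-first ordering avoids the annotation failures of the query-first paradigm.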
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Tool-use LLMs, LLM/AI agents, Dataset Generation
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 5218