Task-Aligned Tool Recommendation for Large Language Models

ACL ARR 2025 July Submission 1423 Authors

29 Jul 2025 (modified: 20 Aug 2025) · CC BY 4.0
Abstract: Augmenting Large Language Models (LLMs) with external tools significantly enhances their capacity to solve complex problems. However, despite ongoing advances in the parsing capabilities of LLMs, incorporating all available tools into the prompt simultaneously remains impractical due to the sheer number of external tools. It is therefore essential to provide LLMs with a precise set of tools tailored to the specific task, in both quantity and quality. Current tool retrieval methods primarily focus on refining the ranking list of tools and directly packaging a fixed number of top-ranked tools as the tool set. Because the optimal number of tools varies across tasks, these approaches often fail to equip LLMs with the optimal tool set prior to execution, introducing redundant or unsuitable tools that impede immediate access to the most relevant ones. This paper addresses the challenge of recommending precise tool sets for LLMs. We introduce the problem of tool recommendation, define its scope, and propose a novel Precision-driven Tool Recommendation (PTR) approach. PTR captures an initial, concise set of tools by leveraging historical tool bundle usage and dynamically adjusts the tool set through tool matching, culminating in a multi-view-based tool addition. Additionally, we present a new dataset, RecTools, and a metric, TRACC, designed to evaluate the effectiveness of tool recommendation for LLMs. Comprehensive experiments validate our design choices, demonstrating promising accuracy on two open benchmarks and our RecTools dataset.
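The three-stage pipeline the abstract describes (seed a concise set from historical tool bundle usage, adjust it by tool matching, then add tools via multi-view scoring) can be sketched as below. This is an illustrative assumption of how such a pipeline might look, not the paper's actual implementation: the `jaccard` matcher, the threshold, and the two "views" (tool name and tool description) are all placeholders for the learned components PTR would use.

```python
def jaccard(a, b):
    """Token-overlap similarity between two strings (toy stand-in matcher)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def recommend_tools(query, tools, history, match_threshold=0.1, max_extra=2):
    """Return a variable-size tool set for `query` (PTR-style sketch).

    tools:   {tool_name: description}
    history: list of (past_query, tool_bundle) pairs
    """
    # Step 1: seed from the historical bundle whose past query is most
    # similar to the current one (historical tool bundle usage).
    best_bundle = max(history, key=lambda h: jaccard(query, h[0]))[1]
    toolset = set(best_bundle)

    # Step 2: tool matching -- drop seeded tools whose descriptions do
    # not match the current query well enough.
    toolset = {t for t in toolset
               if jaccard(query, tools[t]) >= match_threshold}

    # Step 3: multi-view addition -- score the remaining candidates from
    # two views (name match + description match) and add the top scorers
    # that still clear the matching threshold.
    candidates = [t for t in tools if t not in toolset]
    scored = sorted(candidates,
                    key=lambda t: jaccard(query, t) + jaccard(query, tools[t]),
                    reverse=True)
    for t in scored[:max_extra]:
        if jaccard(query, tools[t]) >= match_threshold:
            toolset.add(t)
    return toolset
```

Unlike fixed top-k retrieval, the returned set can shrink or grow per task, which is the point the abstract makes about the optimal number of tools differing across tasks.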
Paper Type: Long
Research Area: Generation
Research Area Keywords: few-shot generation, domain adaptation, text-to-text generation, inference methods
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: No
A2 Elaboration: We foresee no potential risks in our research
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 3 and Appendix C
B2 Discuss The License For Artifacts: N/A
B2 Elaboration: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Appendix C
B6 Statistics For Data: Yes
B6 Elaboration: Section 3 and Appendix C
C Computational Experiments: Yes
C1 Model Size And Budget: N/A
C1 Elaboration: Section 4
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 4
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4
C4 Parameters For Packages: N/A
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: No
D1 Elaboration: No use of these
D2 Recruitment And Payment: No
D2 Elaboration: No use of these
D3 Data Consent: Yes
D3 Elaboration: Section 3 and Appendix C
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: No use
Author Submission Checklist: yes
Submission Number: 1423