TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

ACL ARR 2025 May Submission108 Authors

07 May 2025 (modified: 03 Jul 2025)ACL ARR 2025 May SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Abstract: Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with environments, a critical step toward generalized AI. However, the standard supervised fine-tuning (SFT) approach, which relies on large-scale datasets, often overlooks task-specific characteristics in tool use, leading to performance bottlenecks. To address this issue, we analyze three existing LLMs and uncover key insights: training data can inadvertently impede tool-use behavior, token importance is distributed unevenly, and errors in tool calls fall into a small set of categories. Building on these findings, we propose *TL-Training*, a task-feature-based framework that mitigates the effects of suboptimal training data, dynamically adjusts token weights to prioritize key tokens during SFT, and incorporates a robust reward mechanism tailored to error categories, optimized through proximal policy optimization. We validate TL-Training by training CodeLLaMA-2-7B and evaluating it on four open-source test sets. Our results demonstrate that the LLM trained by our method matches or surpasses both open- and closed-source LLMs in tool-use performance using only 1,217 training data points. Additionally, our method enhances robustness in noisy environments and improves general task performance, offering a scalable and efficient paradigm for tool-use training in LLMs.

Paper Type: Long

Research Area: Generation

Research Area Keywords: domain adaptation, interactive and collaborative generation, text-to-text generation, automatic evaluation

Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models, Data resources

Languages Studied: English, Chinese

Submission Number: 108

Loading