Abstract: Fine-tuning on agent-environment interaction trajectory data has high potential for surfacing generalized agent capabilities in open-source large language models (LLMs). In this work, we curate SuperAgent, by far the largest trajectory tuning data collection, featuring more than 50k diverse, high-quality interaction trajectories with GPT-annotated rationales. SuperAgent comprises 16 tasks covering five distinct agent skill dimensions. Furthermore, we present Samoyed, a series of open-source LLMs fine-tuned on SuperAgent. Our comparative experiments show that Samoyed outperforms strong baseline LLMs on both held-in and held-out tasks, demonstrating the effectiveness of scaling interaction trajectory data for acquiring generalized agent capabilities. Additional studies yield several key observations regarding trajectory tuning and agent skill generalization. To facilitate future research on open-source LLM agents, we will release the SuperAgent dataset, the Samoyed checkpoints, and the unified evaluation framework.
Paper Type: long
Research Area: Special Theme (conference specific)
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English