Abstract: We propose \textsc{Magnet}, a principled approach to synthesizing high-quality training trajectories that enhance the function-calling capability of large language model agents in multi-turn conversations with humans. To capture the complex function interactions that arise in multi-turn settings, we take a graph perspective and design novel node operations to build reliable signature paths over functions. We iteratively transform each signature path into pairs of multi-turn user queries and executable function calls (FCs). Motivated by context distillation, we use these query-FC pairs to sample positive trajectories from a teacher model, with the references provided as context, and negative trajectories that contrast with the positive ones on targeted error types. Experiments show that our 14B model, trained on the positive trajectories with supervised fine-tuning and preference optimization against the negative trajectories, \textsc{Magnet}-14B-mDPO, obtains 68.01 on BFCL-v3 and 73.30 on ToolQuery, surpassing the teacher model Gemini-1.5-pro-002 by a large margin.
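The signature-path idea in the abstract can be illustrated with a minimal sketch: treat each function as a graph node, connect function u to function v when u's output type matches one of v's input types, and walk the graph to obtain an ordered chain of composable functions. All names, data structures, and the greedy walk below are illustrative assumptions, not the paper's actual node operations or API.

```python
# Hypothetical sketch of building signature paths over a function graph.
# Functions are nodes; an edge u -> v exists when u's output type is one
# of v's input types, so a path is a chain of composable function calls.
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSig:
    name: str
    inputs: tuple  # input type names
    output: str    # output type name


def build_graph(sigs):
    """Edge u -> v when u's output type appears among v's input types."""
    edges = {s.name: [] for s in sigs}
    for u in sigs:
        for v in sigs:
            if u.name != v.name and u.output in v.inputs:
                edges[u.name].append(v.name)
    return edges


def sample_signature_path(sigs, start, length):
    """Greedy walk from `start`: a chain of functions whose signatures compose."""
    edges = build_graph(sigs)
    path = [start]
    while len(path) < length and edges[path[-1]]:
        path.append(edges[path[-1]][0])  # deterministic pick for this sketch
    return path


# Toy tool inventory (hypothetical): each signature path could later seed
# one multi-turn query/function-call pair per node, as the abstract describes.
sigs = [
    FunctionSig("search_flights", ("city",), "flight_id"),
    FunctionSig("get_price", ("flight_id",), "price"),
    FunctionSig("book_flight", ("flight_id", "price"), "booking_id"),
]
path = sample_signature_path(sigs, "search_flights", 3)
print(path)  # → ['search_flights', 'get_price', 'book_flight']
```

In this toy inventory, `search_flights` produces a `flight_id` consumed by both `get_price` and `book_flight`, so the walk recovers a three-step chain; the paper's actual node operations for constructing reliable paths are more involved than this greedy pick.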
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: LLM, data synthesis, LLM Agent, Tool-use, Post-training
Languages Studied: English
Submission Number: 5192