NexusRaven: a commercially-permissive Language Model for function calling

Venkat Krishna Srinivasan; Zhen Dong; Banghua Zhu; Brian Yu; Hanzi Mao; Damon Mosk-Aoyama; Kurt Keutzer; Jiantao Jiao; Jian Zhang

Published: 28 Oct 2023, Last Modified: 26 Nov 2023

Venue: Instruction Workshop @ NeurIPS 2023

Keywords: instruction tuning, synthetic data generation, large language models, tool use, function call, API, software manipulation, decision making

Abstract: The rise of open-source, commercially permissive large language models (LLMs) is reshaping generative AI, giving organizations greater control, lower data risk, and lower cost than proprietary models. However, in tool use and function calling, many open-source models, such as Gorilla and ToolLLAMA, depend on proprietary LLMs like GPT-4 for high-quality training data, and the use of such data is often legally restricted in competing commercial applications. In this paper, we introduce NexusRaven-13B, an open-source LLM designed for function calling. Derived from the CodeLLAMA-13B lineage, NexusRaven-13B uses a data curation pipeline based on multi-step refinement, which yields high-quality training data without relying on GPT-4 distillation. NexusRaven-13B matches GPT-3.5 in zero-shot function-calling accuracy. When combined with our second core technique, demonstration retrieval augmentation, its performance significantly surpasses GPT-4. The code, model, and demo will be available after the review process.
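The abstract names demonstration retrieval augmentation but does not describe how it works. The sketch below illustrates the general idea only: for a new query, retrieve the most similar stored (query, function call) demonstrations and place them in the prompt ahead of the query. Everything in it is an assumption made for illustration rather than the authors' method: the Jaccard token-overlap similarity stands in for whatever retriever the paper uses, and the names `retrieve_demonstrations`, `build_prompt`, `POOL`, and the `get_weather` / `convert_currency` functions are hypothetical.

```python
import re
from typing import List, Set, Tuple

# A demonstration pairs a natural-language query with the call that answers it.
Demo = Tuple[str, str]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity; an assumed stand-in for a real retriever."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_demonstrations(query: str, pool: List[Demo], k: int = 2) -> List[Demo]:
    """Return the k demonstrations whose queries are most similar to `query`."""
    q_tokens = tokenize(query)
    ranked = sorted(
        pool,
        key=lambda demo: jaccard(q_tokens, tokenize(demo[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, function_docs: str, demos: List[Demo]) -> str:
    """Assemble a prompt: function docs, retrieved demonstrations, then the query."""
    lines = ["Available functions:", function_docs, ""]
    for demo_query, demo_call in demos:
        lines += [f"Query: {demo_query}", f"Call: {demo_call}", ""]
    lines += [f"Query: {query}", "Call:"]
    return "\n".join(lines)


# Hypothetical demonstration pool, invented for this example.
POOL: List[Demo] = [
    ("What is the weather in Paris today?", 'get_weather(city="Paris")'),
    ("Convert 10 dollars to euros", 'convert_currency(amount=10, src="USD", dst="EUR")'),
    ("What is the weather in Tokyo tomorrow?", 'get_weather(city="Tokyo", day="tomorrow")'),
]

if __name__ == "__main__":
    user_query = "What is the weather in Berlin today?"
    demos = retrieve_demonstrations(user_query, POOL, k=2)
    print(build_prompt(user_query, 'get_weather(city, day="today")', demos))
```

The assembled prompt would then be passed to the function-calling model, which completes the final `Call:` line. A production retriever would more likely use dense embeddings than token overlap; the overlap score is used here only to keep the sketch dependency-free.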

Submission Number: 71
