FARS: FSM-Augmentation to Make LLMs Hallucinate the Right APIs

19 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: LLM, API Prediction, Finite State Machine, Tries, Constrained Decoding
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a way to make an LLM generate the right API calls by grounding its generation to an available set of APIs using a finite state machine-based constrained decoding algorithm.
Abstract: Large Language Models (LLMs) have shown remarkable ability to converse with humans and solve a wide range of tasks. They have also been extended to make use of external tools or services through API calls. This is commonly achieved by fine-tuning the model or through in-context learning, where instructions and descriptions of those external APIs, along with examples of how to call them, are given to the LLM via its prompt. Given the limited context available in the LLM prompt and latency constraints, scaling up to a large number of tools is challenging and requires an external shortlisting process that reduces the large set of APIs to a smaller set of relevant ones whose instructions and examples fit in the prompt. In this work, we propose a new way for an LLM to generate the right API calls without the need to shortlist instructions or examples. Instead, we allow the LLM to hallucinate meaningful output while grounding the generation to an available set of APIs using a finite state machine (FSM)-based constrained decoding algorithm. We call our approach FARS (FSM-Augmentation to make LLMs hallucinate the Right APIs). FARS allows us to ground LLMs to a large set of APIs with semantically meaningful names without using an external retriever or exemplars. We also demonstrate that with FARS, LLMs can seamlessly switch between conversation and API calling during multi-turn dialogs. We show that this can be achieved without any additional fine-tuning beyond the standard instruction tuning typically performed to train LLMs, paving the way toward a truly powerful LLM-based AI assistant. We demonstrate the effectiveness of FARS for API calling on two public task-oriented dialog datasets, SNIPS and MultiWOZ, and a challenging in-house Smart Home Control dataset.
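The submission page includes no code, but the FSM-constrained decoding idea the abstract describes can be illustrated with a small sketch. The hypothetical Python snippet below is our own illustration, not the authors' implementation; all names (`TrieNode`, `build_trie`, `constrained_greedy_decode`, `logits_fn`) are assumptions. It shows how a trie over tokenized API names can act as a simple finite state machine that masks the LLM's next-token choices so that only valid API names can be generated.

```python
# Minimal sketch of trie-based constrained decoding (illustrative only).
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class TrieNode:
    children: Dict[int, "TrieNode"] = field(default_factory=dict)
    is_end: bool = False  # a complete API name terminates at this node


def build_trie(api_token_ids: Sequence[Sequence[int]]) -> TrieNode:
    """Build a trie from the token-id sequences of all allowed API names."""
    root = TrieNode()
    for ids in api_token_ids:
        node = root
        for tok in ids:
            node = node.children.setdefault(tok, TrieNode())
        node.is_end = True
    return root


def constrained_greedy_decode(
    logits_fn: Callable[[List[int]], Sequence[float]],
    root: TrieNode,
    max_len: int = 32,
) -> List[int]:
    """Greedy decoding restricted to paths through the trie.

    `logits_fn(prefix)` stands in for the LLM: given the tokens generated
    so far, it returns one score per vocabulary token.
    """
    state, prefix = root, []
    for _ in range(max_len):
        if not state.children:  # end of a complete API name: stop
            break
        scores = logits_fn(prefix)
        # FSM mask: only tokens that extend some valid API name are eligible.
        best = max(state.children, key=lambda tok: scores[tok])
        prefix.append(best)
        state = state.children[best]
    return prefix


# Toy usage: vocabulary of 5 tokens, two API names [1, 2] and [1, 3].
trie = build_trie([[1, 2], [1, 3]])
fake_logits = lambda prefix: [0.1, 0.9, 0.2, 0.8, 0.0]
print(constrained_greedy_decode(fake_logits, trie))  # -> [1, 3]
```

In a real system, `logits_fn` would be a forward pass of the LLM, and the FSM would cover full API call signatures (names plus argument slots) and transitions back to free-form conversation, rather than bare names; the paper's exact FSM construction may differ from this sketch.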
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2019