TL;DR: We propose SyncTOD, a model that improves the performance of end-to-end task-oriented dialog systems by leveraging large language models.
Abstract: This work explores the effectiveness of large language models (LLMs) for end-to-end task-oriented dialog (TOD) systems. We evaluate Llama2, ChatGPT, and GPT-4 in the few-shot (in-context) setting on two end-to-end TOD datasets and find that their performance is not on par with existing SoTA models. We posit that, unlike the SoTA models, LLM responses do not align well with the training data, owing to the LLMs' limited context size. In response, we propose SyncTOD, which synergizes LLMs with useful hints about the task for improved alignment. At a high level, SyncTOD uses auxiliary models to provide these hints and to perform exemplar selection for the in-context prompts. With GPT-4, SyncTOD outperforms SoTA models on the MultiWOZ and SMD datasets. Further, SyncTOD achieves superior performance compared to both LLMs and SoTA models in low-data settings, while retaining competitive performance in full-data settings.
Paper Type: short
Research Area: Dialogue and Interactive Systems
Contribution Types: NLP engineering experiment
Languages Studied: English