Abstract: Large language models (LLMs) like ChatGPT have demonstrated the ability to perform a variety of natural language processing (NLP) tasks.
However, it remains unclear whether ChatGPT can serve as a task-oriented dialogue system. In this paper, we evaluate the impact of ChatGPT on task-oriented dialogue (TOD) systems and perform a comprehensive analysis of its benefits and challenges. We find that ChatGPT performs well on relatively simple dialogue understanding tasks such as intent detection and slot filling, but fails to understand complex multi-turn conversations and to interact with a knowledge base (KB) in dialogue state tracking and response generation. Future LLM-based TOD work should pay more attention to (1) incorporating domain knowledge, (2) understanding complex instructions, (3) modeling long-term memory, and (4) interacting with external knowledge bases.\footnote{We will open-source our code and all the evaluation results after blind review to facilitate future explorations.}
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English