Navigating the Unknown: Intent Classification and Out-of-Distribution Detection Using Large Language Models

ACL ARR 2025 May Submission805 Authors

15 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Out-of-Distribution (OOD) detection is a challenging task that demands strong generalization and is essential to the practicality and safety of task-oriented dialogue systems (TODS). Large language models (LLMs), with their improved ability to handle diverse patterns and contexts, may help address this task. In this paper, we investigate the current performance of LLMs in the near-OOD setting, where OOD queries belong to the same domain as In-Distribution (ID) queries but express different intents. To take advantage of the off-the-shelf capabilities of LLMs, we do not use fine-tuning. We study the performance of one of the leading frontier models, GPT-4o, on three well-known public datasets and three in-house datasets, using ten different methods and prompt variations. We also study the performance of different prompts and techniques with Gemini 1.5 Flash and Llama 3.1-70b, and we investigate the effect of increasing the number of ID intents. We propose a novel hybrid method that is cost-efficient, high-performing, highly robust, and versatile enough to be used with smaller LLMs without sacrificing performance. It achieves this by combining the strong ID performance of smaller text classification models with the strong generalization capabilities of LLMs in OOD detection.

Paper Type: Long

Research Area: Machine Learning for NLP

Research Area Keywords: spoken dialogue systems, task-oriented, generative models, few-shot learning, generalization

Contribution Types: NLP engineering experiment

Languages Studied: English

Submission Number: 805
