Abstract: Although conversational agents based on large language models (LLMs) demonstrate strong fluency and coherence, they continue to exhibit behavioral errors, such as inconsistencies and factual inaccuracies. Detecting and mitigating these errors is critical for developing trustworthy systems. However, current response correction methods rely heavily on LLMs, which require information about the nature of an error, or hints about its occurrence, for accurate detection. This limits their ability to identify errors not defined in their instructions or covered by external tools, such as those arising from updates to the response-generation model or shifts in user behavior. In this work, we introduce $\textbf{Automated Error Discovery}$, a framework for detecting and defining behavioral errors in conversational AI, and propose $\textbf{SEEED}$ ($\underline{S}$oft-clustering $\underline{E}$xtended $\underline{E}$ncoder-Based $\underline{E}$rror $\underline{D}$etection), an encoder-based alternative to LLMs for error detection. We enhance the Soft Nearest Neighbor Loss by amplifying the distance weighting for negative samples and introduce $\textbf{Label-Based Sample Ranking}$ to select highly contrastive examples for better representation learning. SEEED outperforms adapted baselines across multiple error-annotated dialogue datasets, improving the accuracy of detecting novel behavioral errors by up to 8 points and demonstrating strong generalization to unknown intent detection.
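As a rough illustration of the loss modification described in the abstract, the following is a minimal NumPy sketch of the Soft Nearest Neighbor Loss with an extra `neg_scale` factor that amplifies the distance weighting of negative (different-label) samples. The `neg_scale` formulation and all names here are assumptions for illustration, not the paper's exact definition.

```python
import numpy as np

def soft_nearest_neighbor_loss(x, y, temperature=1.0, neg_scale=1.0):
    """Soft Nearest Neighbor Loss over a batch of embeddings.

    x: (n, d) array of embeddings; y: (n,) array of integer labels.
    neg_scale > 1 amplifies the distances of negative pairs, pushing
    different-label samples apart more aggressively (hypothetical variant).
    """
    n = x.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    same = (y[:, None] == y[None, :]) & ~np.eye(n, dtype=bool)  # positives, excluding self
    diff = y[:, None] != y[None, :]                             # negatives
    # Positives keep the standard weighting; negatives get amplified distances.
    w_pos = np.exp(-d / temperature) * same
    w_neg = np.exp(-neg_scale * d / temperature) * diff
    # Per-sample ratio of same-class neighbor mass to total neighbor mass.
    ratio = w_pos.sum(1) / (w_pos.sum(1) + w_neg.sum(1) + 1e-12)
    return -np.log(ratio + 1e-12).mean()
```

With `neg_scale=1.0` this reduces to the standard formulation; larger values shrink the contribution of distant negatives in the denominator, so well-separated clusters incur a lower loss.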
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: applications, behavioral error, behavioral analysis, error detection, contrastive learning
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 396