Building Multi-turn Intent Classification with LLM-based Labeling

Published: 29 May 2026, Last Modified: 29 May 2026
Venue: ACL 2026 Workshop CustomNLP Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Intent detection, LLM-based annotation, Multi-turn conversation understanding
TL;DR: We propose a scalable framework that uses LLM-assisted labeling to train a low-latency multi-turn intent classifier for real-world customer service dialogues.
Abstract: Intent classification is essential for customer service routing, connecting customers to the appropriate agents and reducing handling time and operational cost. Developing a real-world multi-turn intent classification system is challenging due to complex intent taxonomies, dynamic intent switching within conversations, and limited labeled training data. To address these challenges, we propose a scalable multi-turn intent classification framework for e-commerce customer service that models intent along multiple dimensions. We introduce LLM-based labeling strategies to annotate real customer transcripts at scale and augment training with LLM-simulated multi-turn dialogues that expand coverage of topic and intent switches, which are rare in existing transcripts. Through extensive experiments, we find that explanation-guided labeling with a self-critique step produces the most accurate training labels. Fine-tuned models built on a RoBERTa backbone outperform zero-shot LLM prompting while achieving substantially lower inference latency. Finally, we show that a hybrid approach that combines the fine-tuned classifier with LLM prompting further improves accuracy over either component alone. Overall, our results provide practical guidance for building and deploying high-accuracy, low-latency, large-scale multi-turn intent classification systems.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 19