Speak & Spell: LLM-Driven Controllable Phonetic Error Augmentation for Robust Dialogue State Tracking

ACL ARR 2025 July Submission1266 Authors

29 Jul 2025 (modified: 05 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: Dialogue State Tracking (DST) is a key component of task-oriented dialogue systems, identifying the important information in a conversation. However, its accuracy drops significantly in spoken dialogue environments because Automatic Speech Recognition (ASR) systems introduce errors in named entities. We introduce a simple yet effective data augmentation method that targets those entities to improve the robustness of DST models. The method uses keyword-highlighted prompts to control where errors are placed while introducing phonetically similar errors. As a result, it generates sufficient error patterns on keywords, leading to improved accuracy in noisy and low-accuracy ASR environments.
Paper Type: Short
Research Area: Dialogue and Interactive Systems
Research Area Keywords: dialogue state tracking
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 1266
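
The abstract describes controlling error placement with keyword-highlighted prompts. The following is a minimal sketch of what such a prompt builder might look like. It is not the authors' implementation: the `<hl>` marker, the prompt wording, and the function names `highlight_keywords` and `build_prompt` are all assumptions made for illustration, and the call to an LLM is left out.

```python
import re


def highlight_keywords(utterance, keywords, open_tag="<hl>", close_tag="</hl>"):
    """Wrap each whole-word keyword occurrence in marker tags (case-insensitive).

    The tag format is a hypothetical choice, not taken from the paper.
    """
    # Try longer keywords first so multi-word entities win over their sub-words.
    ordered = sorted(set(keywords), key=len, reverse=True)
    if not ordered:
        return utterance
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", utterance)


def build_prompt(utterance, keywords):
    """Build an instruction asking for phonetic errors only at highlighted spans.

    The instruction text is an illustrative assumption.
    """
    highlighted = highlight_keywords(utterance, keywords)
    return (
        "Rewrite the utterance as a speech recognizer might mis-transcribe it. "
        "Replace only the words inside <hl>...</hl> with phonetically similar "
        "words, and leave everything else unchanged.\n"
        f"Utterance: {highlighted}\n"
        "Noisy transcript:"
    )


if __name__ == "__main__":
    print(build_prompt("i need a taxi to cambridge station", ["cambridge"]))
```

Highlighting the slot values before prompting is one way to keep the injected errors on the named entities that matter for DST while leaving the rest of the utterance intact.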