Low-Hanging Fruit: Knowledge Distillation from Noisy Teachers for Open Domain Spoken Language Understanding
Abstract: Spoken Language Understanding (SLU) plays an integral role in dialogue systems. However, conventional SLU relies heavily on manually annotated datasets, which are impractical for open-domain SLU (OD-SLU) given the wide variety of topics that must be covered. As the dataset grows exponentially, achieving OD-SLU inevitably incurs significant annotation costs. We propose the Noisy Teacher and Consistently Guiding Student (NTCG) paradigm to address these challenges. The first objective is to develop a prompt that effectively extracts valuable knowledge from large language models (LLMs), which can occasionally generate inconsistent and random responses and therefore act as "noisy teachers". This refined knowledge is then imparted to the downstream task model to further improve its performance. To this end, we introduce an Incremental Progress Prompting Scheme (IPPS) under NTCG that employs prompting techniques to generate more reliable annotations for unlabelled OD-SLU data, thereby fostering "Consistently Guiding Students". IPPS first addresses the more straightforward intent prediction task in OD-SLU using self-ranked prompting, which improves the LLM's precision by supplying similar examples from a small, clean set as contextual hints for a given query. In addition, the Intersection Sample Selection method identifies samples that are predicted consistently across different levels of randomness in ChatGPT, further improving accuracy. For the more complex slot-filling task, we propose the Consistent Intent Slot Prompting (CISP) method, which exploits the intent-to-slot correlation matrix to boost accuracy and precision. Finally, the proposed Positively Fine-Tuned Scheme (PFTS) incorporates knowledge distilled from the consistent samples via Label Consistency Regularisation to enhance downstream model performance. These strategies significantly improve intent detection and slot filling for both prompt-based learning and downstream tasks.
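To make the idea behind Intersection Sample Selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "levels of randomness" correspond to sampling temperatures and that a sample is kept only when every run assigns it the same intent. The function name, the temperature values, and the intent labels are all hypothetical.

```python
from typing import Dict, List


def intersection_sample_selection(
    predictions_by_run: Dict[float, List[str]],
) -> Dict[int, str]:
    """Keep only samples whose predicted intent agrees across all runs.

    predictions_by_run maps an assumed randomness level (here a sampling
    temperature) to the list of predicted intents, one per sample, in the
    same sample order for every run.
    Returns a mapping from sample index to the agreed intent label.
    """
    runs = list(predictions_by_run.values())
    if not runs:
        return {}

    n_samples = len(runs[0])
    if any(len(run) != n_samples for run in runs):
        raise ValueError("every run must predict the same number of samples")

    selected: Dict[int, str] = {}
    for i in range(n_samples):
        labels = {run[i] for run in runs}
        if len(labels) == 1:  # all runs agree on this sample
            selected[i] = runs[0][i]
    return selected


# Hypothetical predictions for three utterances at three temperatures.
predictions = {
    0.0: ["play_music", "book_flight", "get_weather"],
    0.5: ["play_music", "book_hotel", "get_weather"],
    1.0: ["play_music", "book_flight", "get_weather"],
}
consistent = intersection_sample_selection(predictions)
```

In this toy input, the second utterance is discarded because one run disagrees on its intent. Under the stated assumptions, the remaining samples would be the consistent annotations passed on to the downstream model.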