Abstract: Utterance domain classification (UDC) is a critical pre-processing step for many speech understanding and dialogue systems. Recently, neural models have shown promising results on text classification. Meanwhile, background information and knowledge beyond the utterance play a crucial role in utterance comprehension. However, improper background information and knowledge are easily introduced due to the ambiguity of entities or noise in knowledge bases (KBs), so the UDC task remains a great challenge. To address this issue, this paper proposes a knowledge-gated (K-Gated) mechanism that leverages domain knowledge from external sources to control the path through which information flows in the neural network. We employ it, together with pre-trained token embeddings from Bidirectional Encoder Representations from Transformers (BERT), in a wide spectrum of state-of-the-art neural text classification models. Experiments on the SMP-ECDT benchmark corpus show that the proposed method achieves strong and robust performance regardless of the quality of the encoder models.
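The abstract does not specify the form of the knowledge gate, but a common way to realize such a mechanism is a sigmoid gate computed from the utterance representation and an external knowledge vector, which then scales the information flowing onward. The sketch below is a hypothetical illustration under that assumption; the function `k_gate` and the parameters `W` and `b` are not from the paper.

```python
import numpy as np

def k_gate(h, k, W, b):
    """Hypothetical knowledge-gated fusion step.

    h: utterance hidden state, shape (d,)
    k: external knowledge vector, shape (d,)
    W: gate weights, shape (d, 2d); b: gate bias, shape (d,)
    """
    # Gate values lie in (0, 1): each dimension decides how much
    # of the utterance representation is allowed to flow through.
    g = 1.0 / (1.0 + np.exp(-(W @ np.concatenate([h, k]) + b)))
    return g * h

# Toy usage with random parameters
rng = np.random.default_rng(0)
d = 4
h = rng.standard_normal(d)
k = rng.standard_normal(d)
W = rng.standard_normal((d, 2 * d))
b = np.zeros(d)
out = k_gate(h, k, W, b)
```

Because the gate is elementwise in (0, 1), each output dimension is an attenuated copy of the corresponding input dimension, so noisy knowledge can suppress, but never amplify, the signal passing through.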