Keywords: named entity recognition, automatic speech recognition, pre-trained models
TL;DR: We confirmed that pre-trained models, i.e., BERT, ELECTRA, and T5, can extract named entities from Japanese ASR text.
Abstract: This paper details our study on Japanese Named Entity Recognition (NER) from Automatic Speech Recognition (ASR) results, which frequently contain speech recognition errors and unknown named entities arising from abbreviations and aliases. One possible solution to this problem is to use a pre-trained model, trained on a large quantity of text, to acquire rich contextual information. In this study, we performed NER on the dialogue logs of a task-oriented dialogue system for road traffic information in Fukui, Japan, using pre-trained BERT-based models and T5. The results confirmed that these pre-trained models achieved significantly higher accuracy on unseen entities than methods based on dictionary matching.
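The abstract contrasts pre-trained models with a dictionary-matching baseline. The following is a minimal sketch of such a baseline; the entity dictionary, labels, and example sentences are hypothetical illustrations (the actual study used dialogue logs on Fukui road traffic), but the sketch shows why unseen surface forms such as abbreviations and aliases are missed by dictionary matching.

```python
def dictionary_ner(text, entity_dict):
    """Return (surface, label) pairs for dictionary entries found in text,
    using longest-match scanning from left to right."""
    spans = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest dictionary entry first at each position.
        for surface in sorted(entity_dict, key=len, reverse=True):
            if text.startswith(surface, i):
                match = surface
                break
        if match:
            spans.append((match, entity_dict[match]))
            i += len(match)
        else:
            i += 1
    return spans


# Hypothetical dictionary of known entities (road and location names).
entity_dict = {"国道8号": "ROAD", "福井": "LOC"}

# A known surface form is found...
print(dictionary_ner("福井の国道8号は渋滞しています", entity_dict))
# ...but an unseen alias of the same road ("8号線") is missed entirely,
# which is the failure mode the pre-trained models address.
print(dictionary_ner("8号線は通行止めです", entity_dict))
```

A pre-trained model fine-tuned for token classification can instead label "8号線" from context, even though that exact string never appeared in training data or a dictionary.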