Incorporating Pinyin into Pipeline Named Entity Recognition from Chinese Speech

Published: 01 Jan 2023, Last Modified: 13 Nov 2024, APSIPA ASC 2023, CC BY-SA 4.0
Abstract: Named Entity Recognition (NER) from speech is usually implemented as a two-step pipeline: (1) transcribing the audio with an Automatic Speech Recognition (ASR) system and (2) applying an NER tagger to the ASR output. In this paper, we incorporate pinyin (the spelled sounds of Chinese characters) into pipeline NER from Chinese speech, aiming to improve NER performance in two ways. First, we use the pretrained model ChineseBERT to embed pinyin, together with glyph and character information, as the input to the NER tagger. Second, we introduce homophone noise into the training data for the NER tagger, since ASR output for Chinese speech is likely to contain homophone errors. With pinyin incorporated into the NER tagger of the two-step pipeline, the F1 score improves by nearly 1 absolute percentage point on the AISHELL-NER dataset, a significant improvement for NER. The F1 score also exceeds the current state-of-the-art (SOTA) result on AISHELL-NER by 0.4 absolute percentage points, despite the slightly worse Character Error Rate (CER) of our ASR system.
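The homophone-noise step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how homophones are selected or at what rate they are injected. The table `HOMOPHONES`, the function `add_homophone_noise`, and the `noise_rate` parameter are all hypothetical names chosen for illustration, and the three-syllable table is a toy stand-in for a full pinyin lexicon. The sketch assumes character-level tagging, so swapping one character for another with the same pinyin leaves the label sequence aligned.

```python
import random

# Hypothetical toy table: pinyin syllable (tone as a digit) -> characters that share it.
# A real system would derive this from a full pinyin lexicon.
HOMOPHONES = {
    "zhang1": ["张", "章", "彰"],
    "li4": ["力", "立", "丽"],
    "jing1": ["京", "经", "精"],
}

# Reverse index: character -> its pinyin syllable in the toy table.
CHAR_TO_PINYIN = {ch: py for py, chars in HOMOPHONES.items() for ch in chars}


def add_homophone_noise(chars, labels, noise_rate, rng):
    """Replace some characters with a different character of the same pinyin.

    Substitution is one-for-one, so the character-level labels are
    returned unchanged and stay aligned with the noisy text.
    """
    noisy = []
    for ch in chars:
        py = CHAR_TO_PINYIN.get(ch)
        if py is not None and rng.random() < noise_rate:
            # Pick a homophone other than the original character.
            candidates = [c for c in HOMOPHONES[py] if c != ch]
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(ch)
    return noisy, list(labels)


if __name__ == "__main__":
    chars = list("张力在北京")
    labels = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
    noisy, kept = add_homophone_noise(chars, labels, 0.5, random.Random(0))
    print("".join(noisy), kept)
```

Training the tagger on a mix of clean and noised sentences exposes it to the kind of same-sound substitutions an ASR system tends to produce, while the entity labels remain valid because sentence length and character positions do not change.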