Abstract: Soundbite identification in broadcast news is important for locating information useful for question answering, mining opinions of a particular person, and enriching speech recognition output with quotation marks. This paper presents a systematic study of this problem under a classification framework, including problem formulation for classification, feature extraction, and the effect of using automatic speech recognition (ASR) output and automatic sentence boundary detection. Our experiments on a Mandarin broadcast news speech corpus show that the three-way classification framework outperforms the binary classification. The entropy-based feature weighting method generally performs better than others. Using ASR output degrades system performance, with more degradation observed from using automatic sentence segmentation than speech recognition errors for this task, especially on the recall rate.
0 Replies
Loading