Universal-WER: Enhancing WER with Segmentation and Weighted Substitution for Varied Linguistic Contexts
Abstract: Word Error Rate (WER) is a crucial metric
for evaluating the performance of automatic
speech recognition (ASR) systems. However,
its traditional calculation, based on Levenshtein
distance, does not account for lexical similarity
between words and treats each substitution in a
binary manner, while also ignoring segmentation errors.
This paper proposes an improvement to WER
by introducing a weighted substitution method,
based on lexical similarity measures, and incorporating splitting and merging operations to
better handle segmentation errors.
Unlike other WER variants, our approach is easy to integrate and generalizes across languages, providing a more nuanced and accurate
evaluation of ASR transcriptions, particularly
for morphologically complex or low-resource
languages.
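
To make the idea concrete, the following is a minimal sketch of a word-level edit distance with weighted substitutions and split/merge operations. It is not the paper's implementation: the abstract does not specify the similarity measure or the operation costs, so the character-level `difflib.SequenceMatcher` ratio used as the lexical similarity, the `seg_cost` value of 0.5, and the function names `similarity` and `weighted_wer` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Assumed lexical similarity in [0, 1]; a stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def weighted_wer(ref: str, hyp: str, seg_cost: float = 0.5) -> float:
    """Edit distance with weighted substitutions and split/merge, divided by reference length.

    seg_cost is a hypothetical cost for one segmentation error (split or merge).
    """
    r, h = ref.split(), hyp.split()
    n, m = len(r), len(h)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # d[i][j] = minimal cost of aligning the first i reference words
    # with the first j hypothesis words.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # i deletions
    for j in range(1, m + 1):
        d[0][j] = float(j)  # j insertions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Substitution cost shrinks as the two words become more similar;
            # identical words cost 0.
            sub = 1.0 - similarity(r[i - 1], h[j - 1])
            best = min(
                d[i - 1][j] + 1.0,      # deletion
                d[i][j - 1] + 1.0,      # insertion
                d[i - 1][j - 1] + sub,  # weighted substitution or match
            )
            # Merge: two reference words were transcribed as one word.
            if i >= 2 and r[i - 2] + r[i - 1] == h[j - 1]:
                best = min(best, d[i - 2][j - 1] + seg_cost)
            # Split: one reference word was transcribed as two words.
            if j >= 2 and h[j - 2] + h[j - 1] == r[i - 1]:
                best = min(best, d[i - 1][j - 2] + seg_cost)
            d[i][j] = best

    return d[n][m] / n
```

Under these assumptions, transcribing "ice cream is good" as "icecream is good" is scored as a single merge rather than a substitution plus a deletion, and a near-miss such as "color" versus "colour" is penalized less than a full substitution.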