TajweedWER: A Diacritic-Aware Evaluation Metric for Automatic Speech Recognition in Quranic Arabic

FARUQ AFOLABI OLUWATOBI

Published: 14 Jun 2026, Last Modified: 21 Jun 2026, Venue: ICML 2026 Workshop MusIML Poster, License: CC BY 4.0

Keywords: automatic speech recognition, word error rate, Quranic Arabic, Tajweed, low-resource evaluation, diacritic normalization

TL;DR: TajweedWER diagnoses and fixes a WER saturation failure in Quranic ASR evaluation: Whisper-medium reaches a 9.5% Base WER on Surah Yasin, whereas naive Surface WER saturates near 100%.

Abstract: Standard Word Error Rate (WER) is poorly suited to evaluating ASR systems on Quranic Arabic recitation, because reference transcripts in Uthmani orthography contain dense diacritical marking (tashkeel) and elided-but-pronounced letters that no general-purpose ASR system outputs by default. We show empirically that naive (Surface) WER computed against a fully vowelised reference saturates near 100% regardless of underlying transcription quality, rendering the metric uninformative. We introduce a corrected evaluation protocol that isolates genuine transcription errors from orthographic convention. The protocol computes corpus-level rather than per-utterance-averaged WER to avoid denominator artifacts, applies comprehensive Arabic and Quranic-extended diacritic stripping, and restores commonly elided alifs specific to Uthmani orthography before scoring. We apply this protocol to Whisper-small and Whisper-medium across 8 professional reciters on Surah Al-Fatiha (29 words) and Surah Yasin (730 words). Whisper-medium achieves a mean Base WER of 10.3% on Al-Fatiha and 9.5% on Yasin, consistently and substantially outperforming Whisper-small (22.0% and 30.3%, respectively). The longer passage produces markedly tighter variance, which supports the stability of the corrected metric. We release our normalization pipeline to support more reliable benchmarking of ASR on Quranic and other diacritic-rich scripts.
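The protocol described in the abstract can be illustrated with a minimal Python sketch. This is not the released pipeline: the function names (`normalize`, `edit_distance`, `corpus_wer`, `mean_utterance_wer`) are hypothetical, and the alif restoration shown here (mapping the superscript "dagger" alif U+0670 and alif wasla U+0671 to a plain alif) is an assumption about what "restoring elided alifs" might involve, not the authors' actual rule set.

```python
import unicodedata

TATWEEL = "\u0640"
ALIF = "\u0627"
ALIF_WASLA = "\u0671"
DAGGER_ALIF = "\u0670"


def normalize(text, restore_alif=True):
    """Reduce Uthmani-style text to a bare skeleton (illustrative only)."""
    text = text.replace(ALIF_WASLA, ALIF)
    if restore_alif:
        # Assumption: a dagger alif marks a pronounced but unwritten alif.
        text = text.replace(DAGGER_ALIF, ALIF)
    kept = []
    for ch in text:
        if ch == TATWEEL:
            continue
        # Quranic annotation signs (small high marks, pause signs, etc.).
        if 0x06D6 <= ord(ch) <= 0x06ED:
            continue
        # Combining marks cover standard tashkeel (fatha, shadda, sukun, ...).
        if unicodedata.category(ch).startswith("M"):
            continue
        kept.append(ch)
    # Collapse any whitespace runs left behind by removed characters.
    return " ".join("".join(kept).split())


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def corpus_wer(pairs, norm=normalize):
    """Total edits divided by total reference words over all utterances."""
    errors = words = 0
    for ref, hyp in pairs:
        r, h = norm(ref).split(), norm(hyp).split()
        errors += edit_distance(r, h)
        words += len(r)
    return errors / words


def mean_utterance_wer(pairs, norm=normalize):
    """Unweighted mean of per-utterance WER; short utterances dominate."""
    scores = []
    for ref, hyp in pairs:
        r, h = norm(ref).split(), norm(hyp).split()
        scores.append(edit_distance(r, h) / len(r))
    return sum(scores) / len(scores)
```

As a toy illustration of the denominator artifact, take a one-word utterance that is entirely wrong alongside a nine-word utterance that is entirely right: `corpus_wer` gives 1/10 = 0.1, while `mean_utterance_wer` gives (1.0 + 0.0)/2 = 0.5. Passing `norm=lambda s: s` skips normalization and reproduces the saturation effect, since every diacritised reference word then mismatches its bare hypothesis.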

Track: Track 1: ML Research Addressing Challenges Faced by Muslim Communities

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Non Archival Confirmation: I understand that submissions to MusIML are non-archival and can be submitted to other venues.

Submission Number: 58
