HUMAIN at IslamicEval 2025 Shared Task 1: A Three-Stage LLM-Based Pipeline for Detecting and Correcting Hallucinations in Quran and Hadith

Arwa Omayrah; Sakhar Alkhereyf; Ahmed Abdelali; Abdulmohsen Al-Thubaity; Jeril Kuriakose; Ibrahim AbdulMajeed

Published: 11 Sept 2025, Last Modified: 21 Sept 2025

Venue: IslamicEval @ ArabicNLP 2025

License: CC BY 4.0

Keywords: llm hallucination, religious text, quranic text

Abstract: This paper presents HUMAIN’s submission to IslamicEval 2025 Shared Task 1, which addresses hallucination detection and correction in LLM-generated Quranic and Hadith content. Our three-stage pipeline comprises: (1) Span Detection via sequence-to-sequence annotation using TANL-style markup, (2) Validation with retrieval-based similarity and substring matching against reference corpora, and (3) Correction through exact matching, LCS alignment, and semantic re-ranking. On the official test set, our system achieved 87.2% F1 for span detection, 86.1% accuracy for validation, and 68.2% accuracy for correction. These results suggest that systematic detection is highly achievable, whereas meaningful correction remains limited by semantic complexity: small textual differences can significantly change religious meaning. Overall, this work contributes a multi-stage LLM-based pipeline for Islamic content verification.
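
To make the validation and correction stages more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "LCS" means longest common subsequence over whitespace tokens, and that the reference corpus is a plain list of strings. The function names (`lcs_length`, `validate_span`, `correct_span`), the `min_ratio` threshold, and the English placeholder corpus are all hypothetical, introduced only for illustration. The retrieval-based similarity and semantic re-ranking steps are omitted.

```python
from typing import List, Optional


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def validate_span(span: str, corpus: List[str]) -> bool:
    """Assumed stage-2-style check: the span occurs verbatim in a reference."""
    normalized = " ".join(span.split())  # collapse repeated whitespace
    if not normalized:
        return False
    return any(normalized in ref for ref in corpus)


def correct_span(span: str, corpus: List[str],
                 min_ratio: float = 0.5) -> Optional[str]:
    """Assumed stage-3-style correction: exact match first, then best LCS.

    Returns the span itself if it already validates, the reference with
    the highest token-level LCS ratio if that ratio reaches `min_ratio`
    (a hypothetical threshold), or None if nothing is close enough.
    """
    if validate_span(span, corpus):
        return span
    tokens = span.split()
    if not tokens:
        return None
    best_ref, best_score = None, 0.0
    for ref in corpus:
        score = lcs_length(tokens, ref.split()) / len(tokens)
        if score > best_score:
            best_ref, best_score = ref, score
    return best_ref if best_score >= min_ratio else None


# Placeholder corpus; a real system would load Quran and Hadith references.
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "a stitch in time saves nine",
]
print(validate_span("quick brown fox", corpus))
print(correct_span("the quick red fox jumps", corpus))
```

Note that the substring check here is character-level, so it can match partial words, and the sketch returns a whole reference entry rather than an aligned sub-span. A practical system for Arabic text would also likely need normalization (for example, of diacritics) before matching, which this sketch does not attempt.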

Submission Number: 4
