Improving Hallucination Detection in Dialogue via Social Framing Analysis
Keywords: hallucination detection, dialogue systems, social framing, calibration, LLM evaluation
TL;DR: Replacing human speaker labels with AI identifiers in dialogue hallucination detection reduces expected calibration error by 53% and helps in knowledge-grounded domains, but hurts coherence tracking in chit-chat.
Abstract: Hallucination detection in dialogue is harder than in single-turn settings due to speaker identity, multi-turn context, and conversational framing. We hypothesize that social framing drives much of this difficulty, building on prior work showing that human-vs-AI speaker attribution shifts LLM factual judgments by 17.7pp. We evaluate a dehumanization intervention (replacing human speaker labels with AI identifiers) on the DiaHalu benchmark (N=1,099) using GPT-5 Nano. While the overall effect is not statistically significant (McNemar's test, p=.149), domain-level analysis reveals that dehumanization improves every metric in knowledge-grounded dialogue (+2.4 F1, +2.5 accuracy) while introducing tradeoffs in chit-chat, where speaker identity is needed for coherence tracking. The clearest gain is in calibration: expected calibration error halves from .027 to .013, with the largest improvement at low confidence (+29pp). Baseline confidence also predicts which samples are vulnerable to framing effects: flipped samples show 3-8pp lower confidence than stable ones across all domains.
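The abstract points at three concrete computations: the speaker-label rewrite, expected calibration error (ECE), and an exact McNemar test over paired predictions. Below is a minimal Python sketch of how each might look, assuming transcripts with turn-prefixed speaker tags of the form `Speaker: utterance`; the label map, bin count, and helper names (`dehumanize`, `expected_calibration_error`, `mcnemar_exact`) are illustrative assumptions, not taken from the paper.

```python
import re
import numpy as np
from scipy.stats import binomtest

# Hypothetical mapping from human speaker labels to AI identifiers;
# the actual labels in DiaHalu transcripts may differ.
HUMAN_TO_AI = {"Human": "Agent A", "User": "Agent A", "Assistant": "Agent B"}

def dehumanize(transcript: str) -> str:
    """Replace human speaker tags at the start of each turn with AI identifiers."""
    pattern = re.compile(r"^(%s):" % "|".join(map(re.escape, HUMAN_TO_AI)), re.M)
    return pattern.sub(lambda m: HUMAN_TO_AI[m.group(1)] + ":", transcript)

def expected_calibration_error(conf, correct, n_bins: int = 10) -> float:
    """Fixed-width-bin ECE: bin-weighted mean |accuracy - confidence| gap."""
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)  # conf == 0 falls outside; fine for a sketch
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

def mcnemar_exact(baseline_correct, treated_correct) -> float:
    """Exact McNemar test: two-sided binomial test on the discordant pairs
    (samples that exactly one of the two conditions gets right)."""
    base = np.asarray(baseline_correct, dtype=bool)
    treat = np.asarray(treated_correct, dtype=bool)
    b = int(np.sum(base & ~treat))  # baseline right, intervention wrong
    c = int(np.sum(~base & treat))  # intervention right, baseline wrong
    return binomtest(b, b + c, 0.5).pvalue
```

Note that the exact McNemar test reduces to a binomial test on the discordant counts, which is why only the b and c cells of the paired contingency table enter the p-value.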
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 186