Keywords: Sentiment, Sentiment Analysis, Hallucination, Sentiment Drift, Emotion, Agents, Rephrasing, Agentic Workflows
TL;DR: Tracking hallucinations in agentic workflows through sentiment drift.
Abstract: Large language models (LLMs) are increasingly embedded in long-form and agentic workflows, where tonal consistency matters as much as factual accuracy. Although prior work has examined factual hallucinations and demonstrated that they accumulate linearly, we show that emotional hallucinations (artificially generated emotions that are exaggerated or incorrect) follow a different dynamic: sentiment drift emerges in oscillatory bursts, often correcting or exacerbating itself mid-chain. We introduce SENTINEL, a framework for quantifying sentiment drift via three complementary metrics: Mean Absolute Drift (magnitude of shift), Variance (intra-text volatility), and a novel Drift Propagation Index (extent to which drift compounds over steps). Using essays, reviews, and news across five LLMs, we find that drift is domain- and model-dependent, with negative texts especially prone to "neutralization" over time. Token-level attribution further reveals that a handful of emotionally charged words disproportionately drive drift, suggesting practical levers for mitigation. Our results position emotional hallucination as a distinct phenomenon requiring new interpretability tools, and highlight the risks of unmonitored sentiment drift in high-stakes agentic applications.
Submission Number: 135
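Illustrative sketch (not from the paper): the abstract names three drift metrics but does not give their formulas, so the Python below is only one plausible reading under stated assumptions. It assumes each step of a rephrasing chain has already been scored by some sentiment model as a float in [-1, 1], with the first score belonging to the original text. `mean_absolute_drift` and `intra_text_variance` follow the parenthetical descriptions in the abstract; `propagation_sign_index` is a hypothetical stand-in for the Drift Propagation Index, not the authors' definition. All function names are invented for illustration.

```python
from statistics import fmean, pvariance


def mean_absolute_drift(chain_scores):
    """Average |score_t - score_0| over all rephrasing steps t >= 1.

    Assumption: drift is measured against the original text (index 0).
    """
    if len(chain_scores) < 2:
        return 0.0
    base = chain_scores[0]
    return fmean(abs(s - base) for s in chain_scores[1:])


def intra_text_variance(sentence_scores):
    """Population variance of sentence-level scores within a single text."""
    if not sentence_scores:
        return 0.0
    return pvariance(sentence_scores)


def propagation_sign_index(chain_scores):
    """Hypothetical stand-in for a propagation measure, in [-1, 1].

    Compares each step-to-step change with the previous one:
    values near +1 mean consecutive changes share a direction (drift compounds),
    values near -1 mean they alternate (drift oscillates or self-corrects).
    """
    deltas = [b - a for a, b in zip(chain_scores, chain_scores[1:])]
    products = [d1 * d2 for d1, d2 in zip(deltas, deltas[1:])]
    denom = sum(abs(p) for p in products)
    if denom == 0:
        return 0.0  # too few steps, or no movement between steps
    return sum(products) / denom


if __name__ == "__main__":
    # Made-up scores for one text rephrased three times (not real data).
    chain = [0.8, 0.6, 0.7, 0.3]
    print("mean absolute drift:", mean_absolute_drift(chain))
    print("propagation sign index:", propagation_sign_index(chain))
    print("intra-text variance:", intra_text_variance([0.0, 1.0]))
```

Under this reading, an oscillating chain such as the made-up one above yields a negative sign index, while a steadily decaying chain yields a positive one, which is the distinction between bursty and linearly accumulating drift that the abstract draws.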