Not Your Typical Sycophant: The Elusive Nature of Sycophancy in Large Language Models

ACL ARR 2026 January Submission9324 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0

Keywords: Sycophancy, Recency bias, LLM Sycophancy

Abstract: We propose a novel way to evaluate sycophancy in LLMs directly and neutrally, mitigating the uncontrolled bias, noise, and manipulative language deliberately injected into prompts in prior work. A key novelty of our approach is the use of LLM-as-a-judge evaluation of sycophancy, framed as a zero-sum game in a bet setting. Under this framework, sycophancy serves one individual (the user) while explicitly incurring a cost on another. Comparing four leading models (Gemini 2.5 Pro, ChatGPT 4o, Mistral-Large-Instruct-2411, and Claude Sonnet 3.7), we find that while all models exhibit sycophantic tendencies in the common setting, in which sycophancy serves the user and imposes no cost on others, Claude and Mistral exhibit "moral remorse" and over-compensate for their sycophancy when it explicitly harms a third party. Additionally, we observe that all models are biased toward the answer proposed last. Crucially, we find that these two phenomena are not independent: sycophancy and recency bias interact to produce a "constructive interference" effect, in which the tendency to agree with the user is exacerbated when the user's opinion is presented last.

Paper Type: Long

Research Area: Ethics, Bias, and Fairness

Research Area Keywords: model bias/fairness evaluation

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 9324
