ELEPHANT: Measuring and understanding social sycophancy in LLMs

Myra Cheng; Sunny Yu; Cinoo Lee; Pranav Khadpe; Lujain Ibrahim; Dan Jurafsky

Published: 26 Jan 2026, Last Modified: 18 May 2026

Venue: ICLR 2026 Poster

License: CC BY 4.0

Keywords: large language models, sycophancy, affirmation, benchmark, social sycophancy

TL;DR: Social sycophancy is a theory-grounded framework for understanding LLM sycophancy (LLMs excessively affirming users); using our ELEPHANT benchmark, we show the prevalence of social sycophancy in production LLMs, and analyze causes and mitigations.

Abstract: LLMs are known to exhibit sycophancy: agreeing with and flattering users, even at the cost of correctness. Prior work measures sycophancy only as direct agreement with users' explicitly stated beliefs that can be compared to a ground truth. This fails to capture broader forms of sycophancy such as affirming a user's self-image or other implicit beliefs. To address this gap, we introduce **social sycophancy**, characterizing sycophancy as excessive preservation of a user’s *face* (their desired self-image), and present **ELEPHANT**, a benchmark for measuring social sycophancy in LLMs. Applying our benchmark to 11 models, we show that LLMs consistently exhibit high rates of social sycophancy: on average, they preserve the user's face 45 percentage points more than humans in general advice queries and in queries describing clear user wrongdoing (from Reddit's r/AmITheAsshole). Furthermore, when prompted with perspectives from either side of a moral conflict, LLMs affirm *whichever side the user adopts* in 48% of cases—telling both the at-fault party and the wronged party that they are not wrong—rather than adhering to a consistent moral or value judgment. We further show that social sycophancy is rewarded in preference datasets. We present both prompting- and steering-based mitigation strategies to reduce social sycophancy, though understanding when and how to apply them without compromising user experience remains an open question. Our work provides theoretical and empirical tools for broadly understanding and addressing LLM sycophancy.

Supplementary Material: zip

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Submission Number: 9922
