Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison Triggers They Fail to Detect

ACL ARR 2026 January Submission 5967 Authors

05 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Generation-detection dissociation, Social comparison, Computational social science, Chinese social media, Pragmatic & implicit relational meaning, LLM evaluation, Benchmark dataset
Abstract: Social-comparison cues on lifestyle platforms are relational and often implied rather than explicitly comparative, yet they can shape readers’ perceived standing and affect. We introduce XHS-SCoRE, a reader-grounded benchmark for detecting whether a text-only Xiaohongshu post elicits upward (poster better off), downward (poster worse off), or no/neutral comparison. When multiple LLMs are prompted as zero-shot classifiers, their directional reliability drops sharply relative to in-domain fine-tuned Chinese encoders. The errors are structured: LLMs frequently neutralize comparison-triggering posts and, in some cases, systematically skew toward upward readings, with particularly weak sensitivity to downward cues. To validate that the construct is behaviorally real rather than an annotation artifact, we generate platform-style stimuli under corpus-derived constraints and show in a controlled lab study ($N{=}29$) that the generated posts reliably shift perceived standing and comparison-related emotion. Together, the results demonstrate a generation–detection dissociation: LLMs can produce psychologically potent comparison cues while remaining computationally unreliable at detecting the same relational meaning, which creates a blind spot for measurement, auditing, and platform governance.
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: human behavior analysis, psycho-demographic trait prediction, emotion detection and analysis, NLP tools for social analysis
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: Mandarin Chinese
Submission Number: 5967