Keywords: Large language models, Social Reasoning
Abstract: As LLMs are increasingly deployed in real-world interactions, their social reasoning in interpersonal situations becomes critical. To explore their capabilities, we introduce SCRTIPS, a 1k-dialogue dataset in English and Korean, sourced from movie scripts.
We further propose a social reasoning task based on SCRTIPS that evaluates the capacity of LLMs to infer the social relationship (e.g., friends, sisters, lovers) between the speakers in each dialogue.
Across evaluations of nine models, current proprietary LLMs achieve around 75–80% accuracy on the English dataset and 58–69% on the Korean one.
Strikingly, in 10–25% of responses in both languages, models predict relationships that human annotators labeled as Unlikely.
Furthermore, we find that thinking models and chain-of-thought prompting provide minimal benefits for social reasoning and occasionally amplify social biases.
In sum, current LLMs show significant limitations in their social reasoning capabilities, especially in Korean, highlighting the need for further efforts to develop socially aware LLMs.
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: language/cultural bias analysis, sociolinguistics
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English, Korean
Submission Number: 7475