Building Data Framework and Shifting Perspectives for First-Person Exploration of Social Intelligence in LLMs

Guiyang Hou; Xiang Huang; Yihui Fu; Zeqi Tan; Wenqi Zhang; Weiming Lu

20 Sept 2025 (modified: 04 Dec 2025); ICLR 2026 Conference Withdrawn Submission; CC BY 4.0

Keywords: First-Person Perspective, Social Intelligence, Data Framework, Human-Machine Interactive Dialogue

Abstract: Social intelligence is built upon three foundational pillars: cognitive, situational, and behavioral intelligence. As Large Language Models (LLMs) are increasingly integrated into our social lives, understanding, evaluating, and developing their social intelligence are becoming important. Multiple works have investigated the social intelligence of LLMs, but three limitations remain: (1) most focus on a single pillar, and a comprehensive framework for organizing and studying the social intelligence of LLMs remains underdeveloped; (2) most position LLMs as passive observers from a third-person perspective, whereas ego-centric first-person evaluation aligns better with actual LLM-based agent use scenarios; (3) comprehensive evaluation of behavioral intelligence is lacking, in particular a more intuitive comparison of behavioral differences between humans and LLMs. In light of these limitations, we introduce the EgoSocialArena framework, built upon the three foundational pillars of social intelligence (cognitive, situational, and behavioral intelligence), with each pillar supported by a novel and systematic evaluation design. Using EgoSocialArena, we conduct a comprehensive evaluation of fourteen foundation models and investigate several important questions, including the social intelligence performance of Large Reasoning Models, the limitations of existing social intelligence evaluation frameworks in interactive dialogue settings, and whether a perspective shift can elicit social capabilities in the way that Chain-of-Thought elicits math capabilities.

Supplementary Material: zip

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 24145
