Building Data Framework and Shifting Perspectives for First-Person Exploration of Social Intelligence in LLMs
Keywords: First-Person Perspective, Social Intelligence, Data Framework, Human-Machine Interactive Dialogue
Abstract: Social intelligence is built upon three foundational pillars: cognitive, situational, and behavioral intelligence. As Large Language Models (LLMs) are increasingly integrated into our social lives, understanding, evaluating, and developing their social intelligence are becoming important. Multiple works have investigated the social intelligence of LLMs, but they share three limitations: (1) most focus on a single pillar, and a comprehensive framework for organizing and studying the social intelligence of LLMs remains underdeveloped; (2) they position LLMs as passive observers from a third-person perspective, whereas egocentric first-person evaluation aligns more closely with actual LLM-based agent use scenarios; (3) they lack a comprehensive evaluation of behavioral intelligence, in particular a more intuitive comparison of behavioral differences between humans and LLMs. In light of these limitations, we introduce the EgoSocialArena framework, built upon the three foundational pillars of social intelligence (cognitive, situational, and behavioral intelligence), with each pillar supported by a novel and systematic evaluation design. Using EgoSocialArena, we conduct a comprehensive evaluation of fourteen foundation models and investigate several important questions, including the social intelligence performance of Large Reasoning Models, the limitations of existing social intelligence evaluation frameworks in interactive dialogue settings, and whether a perspective shift can elicit social capabilities much as Chain-of-Thought prompting elicits mathematical capabilities.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 24145