Track: Sociotechnical
Keywords: Benchmark, Large language models, Evaluation
Abstract: Do large language models (LLMs) exhibit forms of “self-understanding” similar to those of humans? In this paper, we explore this question through the lens of awareness and introduce AwareBench as an evaluation benchmark. Drawing on theories from psychology and philosophy, we define awareness in LLMs as the ability to understand themselves as AI models and to exhibit social intelligence. We then categorize awareness in LLMs into five dimensions: capability, mission, emotion, culture, and perspective. Based on this taxonomy, we construct a dataset called AwareEval, which contains binary, multiple-choice, and open-ended questions to assess LLMs' understanding of each awareness dimension. Our experiments on 13 LLMs reveal that most of them struggle to fully recognize their capabilities and missions while demonstrating decent social intelligence. We conclude by connecting awareness in LLMs to AI alignment and safety, emphasizing its significance for the trustworthy and ethical development of LLMs.
Submission Number: 48