Abstract: In the last couple of years, there has been a flood of interest in studying the extent to which language models (LMs) have a theory of mind (ToM), the ability to ascribe mental states to themselves and others. The results paint an unclear picture of the current state of the art, with some studies finding near-human performance and others near-zero. To make sense of this landscape, we survey 15 recent studies aimed at measuring ToM in LMs and find that, while almost all check for human-identifiable issues, fewer than half check for patterns that only a machine might exploit. Among those that do perform such validation, none find that LMs exceed human performance. We conclude that the datasets lacking this validation are likely easier than their peers, plausibly due to the presence of spurious patterns in the data, and we caution against building ToM benchmarks that rely solely on human validation.
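As a hedged illustration of the kind of machine-oriented validation the abstract alludes to (this sketch is not from the paper itself), one common sanity check is an input-only probe: if a shallow classifier can predict a benchmark's labels from surface features alone, the dataset likely contains spurious patterns that a model can exploit without any mental-state reasoning. All dataset field names below are hypothetical.

```python
# Hypothetical sketch of a spurious-pattern probe for a ToM benchmark.
# If a bag-of-words classifier beats chance on the labels, the dataset
# likely contains surface cues exploitable without ToM reasoning.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def artifact_probe(texts, labels, n_folds=5):
    """Cross-validated accuracy of a surface-feature-only classifier."""
    probe = make_pipeline(
        CountVectorizer(ngram_range=(1, 2)),   # unigram/bigram counts only
        LogisticRegression(max_iter=1000),
    )
    scores = cross_val_score(probe, texts, labels, cv=n_folds)
    return scores.mean()

# Usage (illustrative field names): accuracy far above the majority-class
# baseline flags likely annotation artifacts in the benchmark.
# acc = artifact_probe(benchmark["scenario"], benchmark["answer"])
```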
Paper Type: Short
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: Cognitive Modeling
Contribution Types: Reproduction study, Data analysis, Position papers, Surveys
Languages Studied: English
Submission Number: 4525