PHAnToM: Persona-Based Prompting Has an Effect on Theory-of-Mind Reasoning in Large Language Models

Published: 2025 · Last Modified: 21 Jan 2026 · ICWSM 2025 · CC BY-SA 4.0
Abstract: The use of LLMs in natural language reasoning has shown mixed results: they sometimes rival or even surpass human performance on simpler classification tasks while struggling with social-cognitive reasoning, a domain where humans naturally excel. These differences have been attributed to many factors, such as variations in prompting and the specific LLMs used, but none of these explanations is conclusive, and no clear mechanisms have been established in prior work. In this study, we empirically evaluate how role-playing, persona-based prompting influences Theory-of-Mind (ToM) reasoning capabilities. Grounding our research in psychological theory, we find that, beyond the inherent variance in the complexity of reasoning tasks, ToM performance differences arise from socially motivated prompting differences. In an era where prompt engineering with role-play is a typical approach to adapting LLMs to new contexts, our research advocates caution: models that adopt specific personas may make errors in social-cognitive reasoning.