Probing the Robustness of Theory of Mind in Large Language Models

Published: 06 Oct 2024, Last Modified: 12 Nov 2024 | WiNLP 2024 | CC BY 4.0
Keywords: Theory of Mind, Benchmark, Robustness
TL;DR: We use a novel dataset of false-belief tasks to evaluate the robustness of LLMs' ToM capabilities under task complications, showing that none of the evaluated models exhibits robust ToM and providing insights into their limitations.
Abstract: Theory of Mind (ToM) is considered essential for understanding the intentions and beliefs of others. Recent advancements in large language models (LLMs) such as ChatGPT have sparked claims that these models exhibit ToM capabilities. However, follow-up studies reveal that these capabilities vanish under slight task variations. This paper introduces a novel dataset comprising 68 tasks across 10 complexity classes, which we use to probe ToM in four open-source LLMs. Our results show that the ToM abilities of these models remain limited. We highlight the challenges involved and suggest directions for future research.
Submission Number: 56