PDDL-Mind: Reliable State Tracking is All You Need for Theory-of-Mind Benchmarks

ACL ARR 2026 January Submission7063 Authors

06 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | Everyone | CC BY 4.0
Keywords: Theory-of-Mind, State Tracking, LLM, Neuro-Symbolic
Abstract: Large language models (LLMs) perform substantially below human level on existing theory-of-mind (ToM) benchmarks, even when augmented with chain-of-thought prompting or probabilistic belief updates. We argue that these failures arise primarily from unreliable implicit state tracking rather than from limitations in high-level reasoning. We introduce PDDL-Mind, a neuro-symbolic framework that decouples environment state evolution from belief inference. By translating narrative descriptions into explicit states and actions expressed in Planning Domain Definition Language (PDDL), and by verifying action-induced state transitions against a predefined domain, PDDL-Mind provides LLMs with a logically consistent and explicit representation of world states for ToM tasks. Experiments on MMToM-QA and MuMA-ToM show that PDDL-Mind achieves an absolute accuracy gain of over 5% over the best existing method on ToM benchmark questions.
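The idea of verifying action-induced state transitions against a predefined domain can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a STRIPS-style domain in which a state is a set of ground atoms and each action has preconditions, add effects, and delete effects. Python sets stand in for PDDL syntax, and every name here (`Action`, `apply_action`, `walk`, `track`, the `at` and `in` predicates) is hypothetical.

```python
from dataclasses import dataclass

# A ground atom is a tuple such as ("at", "alice", "kitchen").
# A world state is a frozenset of ground atoms (closed-world assumption).


@dataclass(frozen=True)
class Action:
    """A grounded STRIPS-style action (hypothetical stand-in for a PDDL action)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def apply_action(state: frozenset, action: Action) -> frozenset:
    """Verify the action's preconditions, then return the successor state."""
    missing = action.preconditions - state
    if missing:
        # The narrative-to-action translation is inconsistent with the domain.
        raise ValueError(f"{action.name}: unmet preconditions {sorted(missing)}")
    return (state - action.del_effects) | action.add_effects


def walk(agent: str, src: str, dst: str) -> Action:
    """Ground a hypothetical 'walk' schema: the agent moves from src to dst."""
    return Action(
        name=f"walk({agent},{src},{dst})",
        preconditions=frozenset({("at", agent, src)}),
        add_effects=frozenset({("at", agent, dst)}),
        del_effects=frozenset({("at", agent, src)}),
    )


def track(initial: frozenset, actions: list) -> list:
    """Return the full state trajectory, failing fast on an invalid transition."""
    states = [initial]
    for action in actions:
        states.append(apply_action(states[-1], action))
    return states


initial = frozenset({("at", "alice", "kitchen"), ("in", "apple", "fridge")})
trace = track(initial, [walk("alice", "kitchen", "bedroom")])
```

In this sketch, an action that the narrative translation proposes but the current state does not support (for example, Alice walking out of a room she is not in) raises an error instead of silently corrupting the state, which is the kind of explicit, checkable state trajectory that could then be handed to an LLM for belief inference.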
Paper Type: Short
Research Area: Discourse, Pragmatics, and Reasoning
Research Area Keywords: pragmatic inference and reasoning
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 7063