Keywords: Theory of Mind, Agent Modeling, Bayesian Inverse Planning, Large Language Models, Cognitive Modeling
TL;DR: We propose AutoToM, an automated agent modeling method for scalable, robust, and interpretable mental inference.
Abstract: Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. Current approaches to ToM reasoning either rely on prompting Large Language Models (LLMs), which are prone to systematic errors, or use handcrafted, rigid agent models for model-based inference, which are more robust but fail to generalize across domains. In this work, we introduce *AutoToM*, an automated agent modeling method for scalable, robust, and interpretable mental inference. Given a ToM problem, *AutoToM* first proposes an initial agent model and then performs automated Bayesian inverse planning based on this model, leveraging an LLM backend. Guided by inference uncertainty, it iteratively refines the model by introducing additional mental variables and/or incorporating more timesteps in the context. Across five diverse benchmarks, *AutoToM* outperforms existing ToM methods and even large reasoning models. Additionally, we show that *AutoToM* can produce human-like confidence estimates and enable online mental inference for embodied decision-making.
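To make the abstract's pipeline concrete, here is a minimal sketch of the loop it describes: score candidate mental states by Bayesian inverse planning, measure posterior uncertainty, and widen the context when uncertainty is high. This is not the authors' implementation; `llm_likelihood` is a hypothetical placeholder (the paper uses a prompted LLM backend, faked here with word overlap so the sketch runs), and the hypothesis set, threshold, and refinement rule are illustrative assumptions.

```python
import math

def llm_likelihood(action: str, hypothesis: str) -> float:
    """Placeholder for an LLM-scored likelihood P(action | hypothesis).

    In AutoToM this score would come from a prompted LLM backend; here we
    substitute simple word overlap so the example is self-contained.
    """
    overlap = len(set(action.split()) & set(hypothesis.split()))
    return math.exp(overlap)  # unnormalized, strictly positive

def inverse_planning(actions: list[str], hypotheses: list[str]) -> list[float]:
    """Bayesian inverse planning: P(h | a_1..a_T) ∝ P(h) · Π_t P(a_t | h),
    with a uniform prior over hypothesized mental states."""
    log_scores = [
        sum(math.log(llm_likelihood(a, h)) for a in actions) for h in hypotheses
    ]
    z = max(log_scores)
    weights = [math.exp(s - z) for s in log_scores]  # stable softmax
    total = sum(weights)
    return [w / total for w in weights]

def normalized_entropy(probs: list[float]) -> float:
    """Posterior entropy scaled to [0, 1]; serves as the uncertainty signal."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def autotom_loop(all_actions: list[str], hypotheses: list[str],
                 threshold: float = 0.5) -> dict[str, float]:
    """Start from the most recent timestep; incorporate earlier timesteps
    only while the posterior over mental states remains too uncertain."""
    posterior = [1.0 / len(hypotheses)] * len(hypotheses)
    for k in range(1, len(all_actions) + 1):
        posterior = inverse_planning(all_actions[-k:], hypotheses)
        if normalized_entropy(posterior) < threshold:
            break  # confident enough; stop refining
    return dict(zip(hypotheses, posterior))

# Toy usage with hypothetical observations and goal hypotheses:
actions = ["walks to the kitchen", "opens the fridge"]
hypotheses = ["wants a snack from the fridge", "wants to watch TV"]
print(autotom_loop(actions, hypotheses))
```

The sketch only widens the timestep window; the paper's refinement also adds mental variables (e.g., beliefs in addition to goals) to the agent model, which would correspond here to enlarging the hypothesis space rather than just the context.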
Supplementary Material: zip
Primary Area: Neuroscience and cognitive science (e.g., neural coding, brain-computer interfaces)
Submission Number: 21254