The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind

Published: 30 Apr 2026, Last Modified: 24 Jun 2026. Venue: ICML 2026 regular. License: CC BY-NC 4.0
TL;DR: We introduce Decrypto, an interactive language-based benchmark for LLMs to evaluate multi-agent reasoning and theory of mind.
Abstract: As Large Language Models (LLMs) gain agentic abilities, they will have to navigate complex multi-agent scenarios, interacting with human users and other agents in cooperative and competitive settings. This will require new reasoning skills, a crucial one being theory of mind (ToM), or the ability to reason about the "mental" states of other agents. However, ToM and other multi-agent abilities in LLMs are poorly understood, since existing benchmarks suffer from narrow scope, data leakage, saturation, and lack of interactivity. We thus propose Decrypto, a game-based benchmark for multi-agent reasoning and ToM drawing inspiration from cognitive science, computational pragmatics and multi-agent reinforcement learning. It is designed to be as easy as possible in all other dimensions, eliminating confounding factors common in other benchmarks. To our knowledge, it is also the first platform that isolates ToM evaluation in an interactive setting. We validate the benchmark design through comprehensive empirical evaluations of frontier LLMs, robustness studies, and human-AI cross-play experiments. We find that LLMs lag behind humans and simple word-embedding baselines on key game metrics. We then create variants of two classic cognitive science experiments within Decrypto to evaluate three distinct ToM abilities. Surprisingly, our results show that state-of-the-art reasoning models are significantly worse at those tasks than their older counterparts. This demonstrates that Decrypto addresses a crucial gap in current reasoning and ToM evaluations, and paves the path towards better artificial agents. Code at https://github.com/facebookresearch/decrypto.
Lay Summary: Theory of mind (ToM) is the ability to reason about other agents’ minds, including their beliefs, intentions, and abilities. This is a core human capability and forms the basis of collaboration and social interaction. However, ToM remains poorly understood in large language models (LLMs), especially in interactive settings. We introduce an interactive ToM benchmark based on the popular board game Decrypto and show that LLMs struggle with the game in surprising ways. To investigate why, we adapt two seminal experiments from cognitive science and find that LLMs exhibit ToM gaps analogous to those observed in children under the age of five. Our work advances the understanding of ToM in LLMs and paves the way for more collaborative AI agents that better understand both themselves and humans.
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/facebookresearch/decrypto
Primary Area: Deep Learning->Large Language Models
Keywords: theory of mind, pragmatics, multi-agent reasoning, interactive evaluation of LLM agents
Originally Submitted PDF: pdf
Submission Number: 25409