Keywords: theory of mind, multi-agent reasoning, LLM benchmark, zero-shot coordination
TL;DR: We introduce Decrypto, a language-based benchmark for LLMs to assess multi-agent reasoning and theory of mind.
Abstract: We propose Decrypto, a novel interactive benchmark for evaluating coordination, competition, and theory-of-mind (ToM) reasoning in agentic foundation models. Existing benchmarks often suffer from data leakage, saturation, and a lack of interactivity, making it hard to measure how well intelligent systems model other agents' reasoning. To address these limitations, we introduce Decrypto, a multi-agent benchmark based on a popular language-based board game and designed to be future-proof for large language models (LLMs). We validate Decrypto's effectiveness through comprehensive empirical evaluations of frontier LLMs, ablation studies, and human-AI cross-play experiments. We find that LLMs coordinate poorly with both other LLMs and humans, and perform strictly worse than humans. In particular, LLMs struggle to reason about the choices of other agents, even when those agents use the same underlying model, pointing to a fundamental limitation of current systems.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9743