Keywords: Social Reasoning, Social Deduction Games, Large Language Models, Neurosymbolic Reasoning, Bayesian Inference, Factor Graphs, Reasoning Models, Reasoning Scaling, LLM Agents
TL;DR: A hybrid method that lets LLM agents perform constrained probabilistic reasoning, outperforming reasoning LLMs in social deduction games and winning against humans
Abstract: Social reasoning -- inferring unobservable beliefs and intentions from partial observations of other agents -- remains a challenging task for large language models (LLMs). We evaluate the limits of current reasoning language models in the social deduction game Avalon and find that while the largest models demonstrate strong performance, they require extensive test-time inference and degrade sharply when distilled to smaller, real-time-capable variants. To address this, we introduce a hybrid reasoning framework that externalizes belief inference to a structured probabilistic model, while using an LLM for language understanding and interaction. Our approach achieves performance competitive with much larger models in agent-agent play and, notably, yields the first language agent to defeat human players in a controlled study -- achieving a 67% win rate and receiving higher qualitative ratings than both reasoning baselines and human teammates. We release code, models, and a dataset to support future work on social reasoning in LLM agents.
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 13682