Towards Predictive Models of Strategic Behaviour in Large Language Model Agents

Published: 02 Mar 2026, Last Modified: 09 Mar 2026 · ICLR 2026 Workshop AIMS · CC BY 4.0
Keywords: Large Language Models, Game Theory, Strategic Decision Making, Multi-Agent Systems, Mechanism Design, Behavioural Economics, AI Safety, Cooperation
Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents in settings involving cooperation, competition, and coordination, yet current behavioural evaluations provide limited guidance for anticipating risks in deployment. We present a large-scale study of strategic decision-making across seven frontier models, analysing over 200,000 decisions in game-theoretic scenarios. Using controlled experiments, we find that apparent self-recognition effects operate through inferred policy correlation rather than identity: a correlated stranger elicits cooperation equivalent to a correlated self. We further observe substantial heterogeneity across model families, including opposite responses to identical "rationality" instructions, a lever one might otherwise expect to steer agent behaviour, and marked differences in forgiveness and exploitation dynamics in iterated interactions. Finally, we introduce a lightweight prediction framework that requires only 5–10 calibration scenarios and achieves up to R² = 0.70 when forecasting held-out model behaviour. These results demonstrate that systematic behavioural evaluation of LLMs can support pre-deployment risk assessment and shed light on AI agent decision-making in strategic situations.
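The prediction framework described in the abstract suggests a simple shape: observe a new model on a handful of calibration scenarios, then map those observations to forecasts on unseen scenarios. The sketch below is a minimal illustration of that idea under our own assumptions (synthetic low-rank cooperation data, ridge regression, and a synthetic model population larger than the seven real models, purely for a stable demo); it is not the paper's implementation.

```python
# Minimal sketch of a calibration-based behaviour predictor, in the spirit
# of the abstract's framework; the data, features, and ridge regression
# below are illustrative assumptions, not the authors' method.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Hypothetical synthetic data: each model's per-scenario cooperation rate
# is driven by a few latent behavioural traits (e.g. forgiveness).
n_models, n_calib, n_holdout, n_traits = 20, 8, 40, 2
traits = rng.normal(size=(n_models, n_traits))
loadings = rng.normal(size=(n_traits, n_calib + n_holdout))
noise = 0.05 * rng.normal(size=(n_models, n_calib + n_holdout))
rates = 1.0 / (1.0 + np.exp(-(traits @ loadings + noise)))  # squash to [0, 1]

# Calibrate on the first 8 scenarios of all but one model, then forecast
# the held-out model's cooperation rates on the 40 held-out scenarios.
X_train, Y_train = rates[:-1, :n_calib], rates[:-1, n_calib:]
x_test, y_test = rates[-1:, :n_calib], rates[-1, n_calib:]

pred = Ridge(alpha=1.0).fit(X_train, Y_train).predict(x_test)[0]
print(f"held-out R^2: {r2_score(y_test, pred):.2f}")
```

In real usage the synthetic `rates` matrix would be replaced with measured cooperation rates per model per scenario, with the new model's 5–10 calibration observations forming the test row.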
Track: Long Paper
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 117