Keywords: Large Language Models, Evaluation, Competitive Game Theory
TL;DR: We propose to use competitive economics games to evaluate the degree of rationality and the strategic reasoning ability of agents based on large language models.
Abstract: Recent economics literature suggests that large language models (LLMs) are capable of playing various types of economics games. Following these works, we propose to explore competitive games as an evaluation setting for LLMs that incorporates multiple players and makes the environment dynamic. In our experiments, we find that most LLMs are rational in that their strategies increase their payoffs, but not as rational as Nash Equilibria (NEs) would prescribe. Moreover, when game history is available, certain LLMs, such as GPT4, converge faster to the NE strategies, which suggests a higher level of rationality compared to other models. In addition, certain LLMs win more often when game history is available, and we argue that this winning rate reflects their ability to reason about the strategies of other players. In this work, we provide an economics arena for the LLM research community as a dynamic simulation for testing the rationality and strategic reasoning abilities of LLMs.
Submission Number: 7