Constant-Memory Strategies in Stochastic Games: Best Responses and Equilibria

Published: 19 Dec 2025, Last Modified: 05 Jan 2026, AAMAS 2026 Extended Abstract, CC BY 4.0
Keywords: Stochastic games, Bounded rationality, Best response, Restricted memory, Reinforcement learning
TL;DR: We investigate the concept and properties of constant-memory strategies in stochastic games and verify the theoretical results on several sequential social dilemma testbeds.
Abstract: Stochastic games have become a prevalent framework for studying long-term multi-agent interactions, especially in the context of multi-agent reinforcement learning. In this work, we comprehensively investigate the concept of constant-memory strategies in stochastic games. We first establish results on best responses and Nash equilibria for behavioral constant-memory strategies, and then discuss the computational hardness of best responding to mixed constant-memory strategies. These theoretical insights are then verified on several sequential decision-making testbeds, including the *Iterated Prisoner's Dilemma*, the *Iterated Traveler's Dilemma*, and the *Pursuit* domain. This work aims to enhance the understanding of theoretical issues in single-agent planning within multi-agent systems, and to uncover the connection between decision models in single-agent and multi-agent contexts. *The code is provided in the supplementary material and will be open-sourced upon acceptance of this paper.*
Area: Game Theory and Economic Paradigms (GTEP)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 748
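
To make the abstract's notion of best responding to a behavioral constant-memory strategy concrete, here is a minimal sketch in the Iterated Prisoner's Dilemma. It is not the authors' code or algorithm. It relies on the standard observation that, against a fixed opponent whose action depends only on the last joint action, the best-response problem is a small Markov decision process whose state is that last joint action, so value iteration applies. The payoff values (3, 0, 5, 1), the discounted-return criterion, and the names `tit_for_tat` and `best_response` are illustrative assumptions.

```python
# Assumed setup: discounted Iterated Prisoner's Dilemma with textbook payoffs.
C, D = 0, 1  # cooperate, defect

# PAYOFF[my_action][opponent_action] is my stage reward (assumed values).
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def tit_for_tat(last_me, last_opp):
    """Hypothetical memory-one opponent: probability that it cooperates,
    given the last joint action. It simply copies my previous action."""
    return 1.0 if last_me == C else 0.0


def best_response(opp_coop_prob, gamma, iters=2000):
    """Value iteration on the MDP induced by a fixed memory-one opponent.

    State: the last joint action (my_action, opponent_action).
    Returns the optimal state values and a greedy policy over states.
    """
    states = [(m, o) for m in (C, D) for o in (C, D)]
    values = {s: 0.0 for s in states}

    def q(state, action, v):
        # Expected return of playing `action` in `state`, then following v.
        p = opp_coop_prob(*state)
        total = 0.0
        for opp_action, prob in ((C, p), (D, 1.0 - p)):
            reward = PAYOFF[action][opp_action]
            total += prob * (reward + gamma * v[(action, opp_action)])
        return total

    for _ in range(iters):
        values = {s: max(q(s, a, values) for a in (C, D)) for s in states}

    policy = {s: max((C, D), key=lambda a: q(s, a, values)) for s in states}
    return values, policy


if __name__ == "__main__":
    for gamma in (0.9, 0.1):
        values, policy = best_response(tit_for_tat, gamma)
        print(gamma, values[(C, C)], policy)
```

Under these assumptions, a patient agent (discount 0.9) does best by cooperating with tit-for-tat, while a myopic one (discount 0.1) does best by defecting. The mixed-strategy case that the abstract describes as computationally hard is not covered by this sketch, since there the opponent's realized strategy is hidden and the last joint action alone no longer serves as a state.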