Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents

Published: 17 Jun 2024, Last Modified: 02 Jul 2024, ICML 2024 Workshop MHFAIA Poster, CC BY 4.0
Keywords: bounded rationality, game theory, models of cooperation, AI alignment, free-energy principle, multi-agent systems
TL;DR: A novel free-energy-based game-theoretic framework modelling strategic interactions between boundedly-rational agents in partially observable environments, exploring applications to modelling cooperation and AI alignment.
Abstract: We propose a novel framework for modelling strategic interactions between boundedly-rational agents in complex, partially observable environments. Our approach introduces agents that minimize a free-energy functional, capturing the divergence between their beliefs about future trajectories and their preferences, which are represented by a biased probabilistic model. We extend this to multi-agent settings and introduce Free-Energy Equilibria, a new class of game-theoretic solution concepts. We begin by establishing the relationship between Free-Energy Equilibria and existing game-theoretic solution concepts. Then, we propose an approach to studying cooperation by contrasting Free-Energy Equilibria with joint free-energy minimization, extending key concepts from mechanism design. Our framework allows for modelling interactions between agents with varying levels of rationality and biased or incorrect world models, providing insights into human-AI interaction and AI alignment.
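As a rough illustration of the single-agent idea in the abstract, the sketch below treats free energy as the KL divergence between the outcome distribution an agent predicts under its policy and a preference distribution over outcomes. This is an assumption made for illustration, not the paper's definition: the one-step setting, the function names (`kl_divergence`, `predicted_outcomes`, `free_energy`), and all numbers are hypothetical.

```python
import math

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions given as equal-length lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def predicted_outcomes(policy, outcome_model):
    """Outcome distribution induced by a policy over actions.

    outcome_model[a][o] is the agent's believed probability of outcome o
    after action a (hypothetical one-step world model).
    """
    n_outcomes = len(outcome_model[0])
    return [
        sum(policy[a] * outcome_model[a][o] for a in range(len(policy)))
        for o in range(n_outcomes)
    ]

def free_energy(policy, outcome_model, preferences):
    """Assumed free energy: divergence of predicted outcomes from preferences."""
    return kl_divergence(predicted_outcomes(policy, outcome_model), preferences)

# Hypothetical example: two actions, two outcomes.
outcome_model = [[0.9, 0.1], [0.2, 0.8]]
preferences = [0.8, 0.2]  # preferences encoded as a biased distribution

fe_a0 = free_energy([1.0, 0.0], outcome_model, preferences)  # always action 0
fe_a1 = free_energy([0.0, 1.0], outcome_model, preferences)  # always action 1
fe_mix = free_energy([6 / 7, 1 / 7], outcome_model, preferences)  # mixed policy
```

Under these made-up numbers, always playing action 0 gives a lower free energy than always playing action 1, and the mixed policy matches the preference distribution almost exactly, so its free energy is close to zero. The multi-agent equilibrium concepts in the abstract are not modelled here.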
Submission Number: 68