Inference of Intrinsic Rewards and Fairness in Multi-Agent Systems

Published: 17 Jul 2025, Last Modified: 06 Sept 2025
Venue: EWRL 2025 Poster
License: CC BY 4.0
Keywords: inverse reinforcement learning, multi-agent, fairness, bayesian inference, reward learning
TL;DR: We tackle the problem of inferring agents' fairness and intrinsic rewards from demonstrations.
Abstract: From altruism to antagonism, fairness plays a central role in social interactions. But can we truly understand how fair someone is, especially without explicit knowledge of their preferences? We cast this challenge as a multi-agent inverse reinforcement learning problem, explicitly structuring rewards to reflect how agents value the welfare of others. We introduce novel Bayesian strategies, reasoning about the optimality of demonstrations and the characterisation of equilibria in general-sum Markov games. Our experiments, spanning randomised environments and a collaborative cooking task, reveal that coherent notions of fairness can be reliably inferred from demonstrations. Furthermore, by isolating fairness components, we obtain a disentangled understanding of agents' preferences. Crucially, we show that by placing agents in different groups, we can force them to exhibit new facets of their reward structures, cutting through ambiguity to answer the central question: who is being fair?
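As an illustrative sketch only (the abstract does not specify the exact form), rewards that reflect how agents value the welfare of others are commonly structured as a weighted combination of intrinsic rewards, with the weights encoding each agent's fairness profile; all symbols below are assumptions for illustration, not taken from the paper.

```latex
% Hypothetical welfare-weighted reward structure: agent i's effective reward
% combines its own intrinsic reward with the other agents' rewards, weighted
% by fairness coefficients w_{ij}.
\[
  R_i(s, a) \;=\; r_i(s, a) \;+\; \sum_{j \neq i} w_{ij}\, r_j(s, a)
\]
% Here r_i is agent i's intrinsic reward, and w_{ij} > 0 (altruism) or
% w_{ij} < 0 (antagonism) captures how much agent i values agent j's welfare.
% Under this kind of structure, inference from demonstrations would target
% both the intrinsic rewards r_i and the fairness weights w_{ij}.
```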
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Victor_Villin1
Track: Regular Track: unpublished work
Submission Number: 125