Fairness in Cooperative Multiagent Multiobjective Reinforcement Learning using the Expected Scalarized Return
Keywords: Multi-agent learning, Reinforcement learning, Fairness, Multi-objective learning
Abstract: Fairness, understood as equity and compromise across multiple viewpoints, is a necessary consideration in any decision that is evaluated from several, possibly conflicting, perspectives. It is also a property that artificial decision-making agents should uphold if they are to be deployed on real-world problems.
In the sequential decision-making community, work has focused on designing algorithms that ensure fairness either only among agents or only among objectives.
However, most real-world problems cannot be reduced to the optimization of a single objective and involve the control of a fleet of cooperative agents. The multi-objective and multi-agent nature of such problems makes existing algorithms inadequate. Indeed, single-objective multi-agent approaches are not designed for multi-objective optimization, and single-agent multi-objective approaches cannot handle multiple agents. Furthermore, research integrating fairness into Multi-Objective Reinforcement Learning (MORL) has focused on the scalarized expected return (SER) optimization criterion while mostly ignoring the expected scalarized return (ESR) criterion. We argue that fairness in MORL should also be investigated under ESR, since it is sometimes the more suitable criterion for problems where fairness matters. In this paper, we consider the problem of learning objective-wise fair policies in cooperative multi-agent multi-objective sequential decision-making problems. We propose the first mono-policy algorithm able to learn efficient decentralized policies while ensuring fairness across objectives under ESR. Our algorithm is evaluated on a novel environment that models a cooperative multi-objective multi-agent task, and it achieves better performance than the considered baselines.
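To make the SER/ESR distinction concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical fairness utility (the minimum over objectives) and two made-up policies described by their per-episode return vectors. SER applies the utility to the expected return vector, whereas ESR applies the utility to each episode's return vector and then takes the expectation.

```python
# Illustrative only: the utility, the policies, and all numbers are assumptions.

def fair_welfare(return_vector):
    """Hypothetical fairness utility: value of the worst-off objective."""
    return min(return_vector)

def ser(outcomes):
    """Scalarized expected return: utility of the expected return vector.

    `outcomes` is a list of (probability, return_vector) pairs.
    """
    n_objectives = len(outcomes[0][1])
    expected = [sum(p * r[i] for p, r in outcomes) for i in range(n_objectives)]
    return fair_welfare(expected)

def esr(outcomes):
    """Expected scalarized return: expected utility of per-episode returns."""
    return sum(p * fair_welfare(r) for p, r in outcomes)

# Policy A: every episode fully favours one of the two objectives.
policy_a = [(0.5, (10.0, 0.0)), (0.5, (0.0, 10.0))]
# Policy B: every episode is balanced across objectives.
policy_b = [(1.0, (4.0, 4.0))]

print("A: SER =", ser(policy_a), "ESR =", esr(policy_a))  # A: SER = 5.0 ESR = 0.0
print("B: SER =", ser(policy_b), "ESR =", esr(policy_b))  # B: SER = 4.0 ESR = 4.0
```

Under these assumptions, SER prefers policy A because it is fair on average, while ESR prefers policy B because it is fair within every episode. This is the kind of situation in which ESR can be the more suitable criterion.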
Supplementary Material: zip
Type Of Paper: Full paper (max page 8)
Anonymous Submission: Anonymized submission.
Submission Number: 33