DECAF: Learning to be Fair in Multi-agent Resource Allocation

Published: 07 Aug 2024, Last Modified: 07 Aug 2024
Venue: RLSW 2024 Poster
License: CC BY 4.0
Confirmation: Yes
Keywords: Fairness, Multi-agent RL, MARL, Resource Allocation, DQN, Constraint Optimization
TL;DR: We present methods that use RL to estimate utilities in order to learn fair policies in resource allocation problems with centralized constraints.
Abstract: A wide variety of resource allocation problems operate under resource constraints that are managed by a central arbitrator, with agents who evaluate and communicate preferences over these resources. We formulate this class of problems as Distributed Evaluation, Centralized Allocation (DECA) problems, and propose methods to learn fair and efficient policies in centralized resource allocation. Our methods are applied to learning long-term fairness in a novel and general framework for fairness in multi-agent systems. We present three methods: (1) a joint weighted optimization of fairness and utility, (2) a split optimization, learning two separate Q-estimators for utility and fairness, and (3) an online policy perturbation to guide an existing black-box utility function towards fair solutions. Through experiments on multiple domains, we compare these methods and discuss relevant use cases for each of them. We also highlight an important and overlooked factor in learning to act fairly in constrained multi-agent problems: the role of past-discounting and warm starts.
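Illustrative sketch (not taken from the paper): one way to picture the DECA setting is that each agent reports an estimate of what it would gain from receiving a resource, and a central arbitrator chooses an allocation under a capacity constraint. The Python below is a minimal sketch under that assumption. The function names `past_discounted` and `allocate`, the trade-off weight `beta`, the discount `gamma`, and the top-k allocation rule are all hypothetical choices made for illustration; the paper's actual estimators, fairness measure, and constraint solver may differ.

```python
from typing import List


def past_discounted(history: List[float], gamma: float = 0.9) -> float:
    """Discounted sum of past utilities: z_t = gamma * z_{t-1} + u_t.

    Older entries in `history` (earlier in the list) count less.
    This is an assumed form of past-discounting, for illustration only.
    """
    z = 0.0
    for u in history:
        z = gamma * z + u
    return z


def allocate(
    q_utility: List[float],
    q_fairness: List[float],
    capacity: int,
    beta: float = 0.5,
) -> List[int]:
    """Hypothetical central arbitrator.

    q_utility[i] and q_fairness[i] are agent i's estimated gains in utility
    and fairness from receiving one unit of the resource. The arbitrator
    combines them with weight `beta` and serves at most `capacity` agents,
    skipping agents whose combined score is not positive.
    """
    scores = [(1 - beta) * u + beta * f for u, f in zip(q_utility, q_fairness)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(i for i in ranked[:capacity] if scores[i] > 0)
```

With `beta = 0` the arbitrator in this sketch considers utility only, and with `beta = 1` it considers fairness only; intermediate values trade the two off at allocation time.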
Submission Number: 2