Group Fairness in Reinforcement Learning via Multi-Objective Rewards

Published: 03 May 2024, Last Modified: 03 May 2024. Accepted by TMLR.
Abstract: Recent works extend classification group fairness measures to sequential decision processes such as reinforcement learning (RL) by measuring fairness as the difference in decision-maker utility (e.g., accuracy) across groups. This approach suffers when decision-maker utility is not perfectly aligned with group utility, such as in repeat loan applications, where a false positive (loan default) affects the groups (applicants) and the decision-maker (lender) to different degrees. Some works remedy this by measuring fairness in terms of group utility, typically referred to as the group's "qualification", but few offer solutions that yield equality of group qualification. Those that do are prone to violating the "no-harm" principle, lowering the qualification of one or more groups in order to achieve equality. In this work, we characterize this problem space as having three implicit objectives: maximizing decision-maker utility, maximizing group qualification, and minimizing the difference in qualification between groups. We provide an RL policy learning technique that optimizes for these objectives directly by constructing a multi-objective reward function that encodes them as distinct reward signals. Under suitable parameterizations, our approach is guaranteed to respect the "no-harm" principle.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Lihong_Li1
Submission Number: 1926
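
The three objectives named in the abstract can be illustrated with a minimal sketch. This is not the paper's reward construction: the weighted-sum scalarization, the function names `reward_signals` and `scalarize`, the use of mean qualification and max-min gap, and the default weights are all assumptions made for illustration, and nothing here implies the "no-harm" guarantee the abstract describes.

```python
from typing import Sequence, Tuple


def reward_signals(
    decision_maker_utility: float,
    group_qualifications: Sequence[float],
) -> Tuple[float, float, float]:
    """Return three distinct reward signals (hypothetical definitions).

    1. decision-maker utility (to maximize)
    2. mean group qualification (to maximize)
    3. negative qualification gap between groups (to maximize, i.e. shrink the gap)
    """
    if len(group_qualifications) < 2:
        raise ValueError("need at least two groups to measure a gap")
    mean_qualification = sum(group_qualifications) / len(group_qualifications)
    # Assumption: the gap is measured as max minus min across groups.
    qualification_gap = max(group_qualifications) - min(group_qualifications)
    return decision_maker_utility, mean_qualification, -qualification_gap


def scalarize(
    signals: Sequence[float],
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # hypothetical weights
) -> float:
    """Combine the signals with a weighted sum (an assumed scalarization)."""
    if len(signals) != len(weights):
        raise ValueError("signals and weights must have the same length")
    return sum(w * s for w, s in zip(weights, signals))


# Usage: equal qualifications incur no gap penalty; unequal ones do.
equal = scalarize(reward_signals(1.0, [0.5, 0.5]))
unequal = scalarize(reward_signals(1.0, [0.25, 0.75]))
```

In this sketch both calls share the same utility and mean qualification, so `unequal` is lower than `equal` only because of the gap term. Changing the weights shifts the trade-off among the three signals.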