Remembering to Be Fair Again: Reproducing Non-Markovian Fairness in Sequential Decision Making

TMLR Paper4294 Authors

21 Feb 2025 (modified: 18 Mar 2025), Under review for TMLR, CC BY 4.0
Abstract: Ensuring long-term fairness in sequential decision-making is a key challenge in machine learning. Alamdari et al. (2024) introduced FairQCM, a reinforcement learning algorithm that enforces fairness in non-Markovian settings via memory augmentations and counterfactual reasoning. We reproduce and extend their work by validating their claims and introducing novel enhancements. We confirm that FairQCM outperforms standard baselines in fairness enforcement and sample efficiency across different environments. However, alternative fairness metrics (Egalitarian, Gini) yield mixed results, and counterfactual memories show limited impact on fairness improvement. Further, we introduce a realistic COVID-19 vaccine allocation environment based on SEIR, a widely used compartmental model in epidemiology. To accommodate continuous action spaces, we develop FairSCM, which integrates counterfactual memories into a Soft Actor-Critic framework. Our results reinforce the finding that counterfactual memories provide little fairness benefit and, in fact, hurt performance, especially in complex, dynamic settings. The original code, modified to be 70% more efficient, and our extensions will be available on GitHub.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Ian_A._Kash1
Submission Number: 4294
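
The abstract names two alternative fairness metrics, Egalitarian and Gini, without defining them. As a rough illustration only, the sketch below uses the textbook forms: egalitarian welfare as the utility of the worst-off group, and the Gini coefficient as a 0-to-1 inequality index over non-negative utilities. The function names and the idea of scoring a list of per-group cumulative utilities are assumptions made for this example; the paper's exact definitions may differ.

```python
from typing import Sequence


def egalitarian_welfare(utilities: Sequence[float]) -> float:
    """Textbook egalitarian welfare: the utility of the worst-off group.

    Hypothetical helper for illustration; higher values are fairer.
    """
    if len(utilities) == 0:
        raise ValueError("utilities must be non-empty")
    return min(utilities)


def gini_coefficient(utilities: Sequence[float]) -> float:
    """Textbook Gini coefficient of non-negative utilities.

    Returns 0.0 for perfect equality and approaches 1.0 as one group
    receives everything. Hypothetical helper for illustration.
    """
    n = len(utilities)
    if n == 0:
        raise ValueError("utilities must be non-empty")
    if any(u < 0 for u in utilities):
        raise ValueError("utilities must be non-negative")
    total = sum(utilities)
    if total == 0:
        # Everyone has zero utility: treat as perfectly equal.
        return 0.0
    ordered = sorted(utilities)
    # Rank-weighted sum with 1-indexed ranks over ascending utilities.
    weighted = sum(rank * u for rank, u in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Example: assumed per-group cumulative utilities after an episode.
group_utilities = [3.0, 1.0, 2.0]
worst_off = egalitarian_welfare(group_utilities)
inequality = gini_coefficient(group_utilities)
```

The two scores pull in different directions by construction: egalitarian welfare looks only at the minimum, while Gini summarizes the whole distribution, so a policy can improve one without improving the other.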