Balancing the Scales: Reinforcement Learning for Fair Classification

ACL ARR 2024 June Submission2974 Authors

15 Jun 2024 (modified: 03 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Fairness in classification tasks has traditionally focused on removing bias from neural representations, but recent trends favor algorithmic methods that embed fairness into the training process. These methods steer models towards fair performance, avoiding the loss of valuable information that can result from manipulating representations. Reinforcement Learning (RL), with its capacity for learning through interaction and for adjusting reward functions to encourage desired behaviors, emerges as a promising tool in this domain. In this paper, we explore the use of RL to address bias in multi-class classification by scaling the reward function to mitigate bias. We employ the contextual multi-armed bandit framework and adapt three popular RL algorithms to suit our objectives, demonstrating a novel approach to mitigating bias.
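The abstract describes framing multi-class classification as a contextual bandit whose reward is scaled to mitigate bias. The paper body is not included here, so the sketch below is only an illustration of that general idea, not the authors' method: each class is an arm, an epsilon-greedy linear policy picks an arm per context, and the correctness reward is scaled inversely with a group's running accuracy so that errors on under-served groups carry more weight. All names (`scaled_reward`, `group_acc`, the synthetic data) are assumptions made for this example.

```python
import numpy as np

# Illustrative sketch (NOT the paper's implementation): a contextual
# multi-armed bandit for classification, where each class label is an
# arm and the reward is scaled per demographic group to nudge a linear
# policy toward more uniform accuracy across groups.

rng = np.random.default_rng(0)
n_features, n_classes, n_groups = 8, 3, 2
W = np.zeros((n_classes, n_features))   # linear policy weights, one row per arm
group_acc = np.full(n_groups, 0.5)      # running per-group accuracy estimate
lr, eps, decay = 0.1, 0.1, 0.9

def scaled_reward(correct: bool, group: int) -> float:
    # Base reward is +1 for a correct prediction, -1 otherwise; it is
    # scaled inversely with the group's running accuracy, so feedback
    # from poorly-served groups is amplified.
    base = 1.0 if correct else -1.0
    return base / max(group_acc[group], 1e-3)

for step in range(2000):
    x = rng.normal(size=n_features)
    group = int(rng.integers(n_groups))
    # Synthetic label whose relation to the features flips across groups,
    # creating the fairness tension the reward scaling is meant to address.
    label = int(np.argmax(x[:n_classes])) if group == 0 else int(np.argmax(-x[:n_classes]))
    # Epsilon-greedy arm (class) selection under the current linear policy.
    if rng.random() < eps:
        arm = int(rng.integers(n_classes))
    else:
        arm = int(np.argmax(W @ x))
    correct = arm == label
    r = scaled_reward(correct, group)
    # Reward-weighted update on the chosen arm only.
    W[arm] += lr * r * x
    # Update the running accuracy estimate for this group.
    group_acc[group] = decay * group_acc[group] + (1 - decay) * float(correct)

print("final per-group running accuracy:", np.round(group_acc, 2))
```

The inverse-accuracy scaling is one simple choice; the paper's actual scaling scheme and the three adapted RL algorithms would replace the crude reward-weighted update used here.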
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: model bias/unfairness mitigation, reinforcement learning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2974