Abstract: Discrimination is a focal concern in supervised learning algorithms that augment human decision-making. These systems are trained on historical data, which may have been tainted by discrimination, and may therefore learn biases against protected groups. An important question is how to train models without propagating discrimination. In this study, we i) define and model discrimination as perturbations of a data-generating process and show how discrimination can be induced via attributes correlated with the protected attributes; ii) introduce a measure of the resilience of a supervised learning algorithm to potentially discriminatory data perturbations; iii) propose a novel supervised learning algorithm that inhibits discrimination; and iv) show that it is more resilient to discriminatory perturbations on synthetic and real-world datasets than state-of-the-art learning algorithms. The proposed method can be used with general supervised learning algorithms and avoids inducing discrimination while maximizing model accuracy.