Abstract: In research studying the fairness of machine learning algorithms and models, fairness often means that a metric is the same when computed for two different groups of people. For example, one might define fairness to mean that the false positive rate of a classifier is the same for people of different genders, ages, or races. However, it is usually not possible to make this metric identical for all groups. Instead, algorithms ensure that the metric is similar—for example, that the false positive rates are similar. Researchers usually measure this similarity or dissimilarity using either the difference or ratio between the metric values for different groups of people. Although these two approaches are known to be different, there has been little work analyzing their differences and respective benefits. In this paper we examine this relationship analytically and empirically, and conclude that unless there are application-specific reasons to prefer the difference approach, the ratio approach should be preferred.
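As a rough illustration of the two measures described above, the following minimal sketch computes the false positive rate for two groups and then compares them by difference and by ratio. The group data, the function name `false_positive_rate`, and the convention of taking the absolute difference and the smaller-over-larger ratio are assumptions made for this example, not definitions taken from the paper.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN): share of true negatives predicted positive."""
    preds_on_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(preds_on_negatives) / len(preds_on_negatives)

# Hypothetical binary labels and predictions for two groups (illustrative only).
y_true_a = [0, 0, 0, 0, 1, 1]
y_pred_a = [1, 0, 0, 0, 1, 0]
y_true_b = [0, 0, 0, 0, 1, 1]
y_pred_b = [1, 1, 0, 0, 1, 1]

fpr_a = false_positive_rate(y_true_a, y_pred_a)  # 1 of 4 negatives flagged
fpr_b = false_positive_rate(y_true_b, y_pred_b)  # 2 of 4 negatives flagged

# Difference approach: 0 means the groups have identical rates.
difference = abs(fpr_a - fpr_b)

# Ratio approach (assumed convention: smaller over larger, so 1 means
# identical rates). Undefined when the larger rate is 0.
ratio = min(fpr_a, fpr_b) / max(fpr_a, fpr_b)

print(f"FPR A = {fpr_a}, FPR B = {fpr_b}")
print(f"difference = {difference}, ratio = {ratio}")
```

The two measures can disagree about how similar a pair of groups is: false positive rates of 0.01 and 0.06 differ by the same amount as rates of 0.50 and 0.55, yet the first pair has a much smaller ratio.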