Fair Kernel Regression through Cross-Covariance Operators

Published: 19 Jul 2023, Last Modified: 19 Jul 2023. Accepted by TMLR.
Abstract: Ensuring fairness in machine learning models is a difficult problem from both a formulation and an implementation perspective. One sensible criterion for achieving fairness is Equalised Odds, which requires that subjects in protected and unprotected groups have equal true and false positive rates. However, its practical implementation is challenging. This work proposes two ways to address this issue through the conditional independence operator. First, conditioned on the output values, the operator is used as a fairness measure of the independence between model predictions and sensitive variables. Second, it is used as a regularisation term in the problem formulation, which seeks optimal models that balance performance and fairness with respect to the sensitive variables. To illustrate the potential of our approach, we consider different scenarios. First, we use the Gaussian model to provide new insights into the problem formulation and numerical results on its convergence. Second, we present the formulation using the conditional cross-covariance operator, and we anticipate that a closed-form solution is possible in the general problem formulation, including the kernel formulation setting. Third, we introduce a normalised criterion of the conditional independence operator. All formulations are posed under the risk minimisation principle, which leads to theoretical results on performance. Additionally, we provide insights into using these operators in a Gaussian Process setting. Our methods are compared to state-of-the-art methods in terms of performance and fairness metrics on a representative set of real problems. The results obtained with our proposed methodology show promising performance-fairness curves. Furthermore, we discuss the usefulness of the linear weights in the fair model to describe the behaviour of the features when enforcing fairness over a particular set of input features.
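To make the independence-measure idea concrete, here is a minimal NumPy sketch of the *unconditional* HSIC statistic between model predictions and a sensitive variable. This is an illustration only: the paper's criterion is a conditional variant (conditioned on the outputs $Y$), which this sketch omits, and the function names `rbf_kernel` and `hsic` are ours, not the paper's code.

```python
import numpy as np

def rbf_kernel(x, gamma=1.0):
    # Gaussian (RBF) Gram matrix for a 1-D sample array.
    x = x.reshape(len(x), -1)
    d2 = np.sum(x**2, 1)[:, None] + np.sum(x**2, 1)[None, :] - 2 * x @ x.T
    return np.exp(-gamma * d2)

def hsic(K, L):
    # Biased empirical HSIC between two Gram matrices:
    # trace(K H L H) / (n - 1)^2, with H the centering matrix.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Toy check: predictions that leak the sensitive variable score higher.
rng = np.random.default_rng(0)
s = rng.normal(size=200)                   # sensitive variable
yhat_dep = s + 0.1 * rng.normal(size=200)  # predictions leaking s
yhat_ind = rng.normal(size=200)            # predictions independent of s
h_dep = hsic(rbf_kernel(yhat_dep), rbf_kernel(s))
h_ind = hsic(rbf_kernel(yhat_ind), rbf_kernel(s))
```

A larger value indicates stronger statistical dependence between predictions and the sensitive variable, i.e. a less fair model under this measure.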
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: We have incorporated the improvements suggested by the anonymous reviewers. Additionally, we provide a point-by-point response addressing their concerns: the reviewers' questions appear in bold, and our answers in regular text. **The authors need to clarify their contributions. For instance, many ideas (HSIC, Cross-Covariance) have been proposed in other papers and are clearly stated as such. But the use of conditional independence that is claimed as a contribution (contribution 2) was already proposed by Fukumizu et al., which is stated in the main paper. There are enough novel results in the paper to justify acceptance without this contribution.** Thank you for your positive comment and for raising the point of contributions in our work. We agree that the individual components already existed (HSIC, cross-covariance operators, kernel-based conditional independence), but our use of them is distinct: we combine them in a new formulation designed to tackle a particularly important issue in algorithmic fairness. We appreciate the opportunity to clarify the differences between our proposal and the work of Fukumizu et al. (2009) in their paper "Kernel Dimension Reduction in Regression". Fukumizu et al. formulated their method using the conditional independence statement $Y\perp X|\Pi_S X$. This statement captures the notion of the covariate $X$ being conditionally independent of the response $Y$, given the projection of $X$ onto the subspace $\Pi_S$. Their approach involves optimising over the Stiefel manifold, denoted $\mathcal{S}_{d}^{m}(\mathbb{R})$, where $d\leq m$, which is the set of $m\times d$ real-valued matrices $B$ satisfying $B^{\top}B=I_d$.
In contrast, our proposal uses the *equalised odds* condition, $\hat{Y} \perp S|Y$, as a regulariser within a least squares framework, together with a kernel expansion. This regularisation approach enables us to derive closed-form solutions: by incorporating the conditional independence condition as a regulariser, we obtain efficient and analytically tractable solutions to the problem. In summary, while Fukumizu et al. tackled the problem with an optimisation-based approach on the Stiefel manifold, we propose a regularisation-based approach within the least squares setting, leading to closed-form solutions. **The authors must recall the assumptions for applying the representer theorem and detail how it can be used on equation (3). The response to the reviewer about this question was clear, but it should appear in the paper. Maybe the notation $f(X)$ is unclear and should appear as $f(x_i)_i$?** Thanks for pointing this out. We have clarified this notation in the new version of the manuscript.
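The closed-form flavour of this regularised least squares problem can be sketched as follows. This is a simplified NumPy illustration with an *unconditional* HSIC-style cross-covariance penalty (the paper uses the conditional operator), and the function names and toy data are ours, not taken from the released code. Minimising $\|y - K\alpha\|^2 + \lambda\,\alpha^{\top}K\alpha + \mu\,\alpha^{\top}K H K_s H K\alpha$ in $\alpha$ and assuming $K$ invertible yields $\alpha = (K + \lambda I + \mu\, H K_s H K)^{-1} y$.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # Gaussian (RBF) Gram matrix between row-sample arrays A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def fair_kernel_ridge(X, y, S, lam=1e-2, mu=1.0, gamma=1.0):
    # Kernel ridge with a cross-covariance (HSIC-style) fairness penalty:
    #   min_a ||y - K a||^2 + lam a'K a + mu a'K H Ks H K a
    # Setting the gradient to zero (K assumed invertible) gives the
    # closed form  a = (K + lam I + mu H Ks H K)^{-1} y.
    n = len(X)
    K = rbf(X, X, gamma)                  # kernel on inputs
    Ks = rbf(S, S, gamma)                 # kernel on sensitive variable
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    alpha = np.linalg.solve(K + lam * np.eye(n) + mu * H @ Ks @ H @ K, y)
    return alpha, K, Ks, H

# Toy data: the target deliberately leaks the sensitive variable S.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
S = X[:, :1] + 0.1 * rng.normal(size=(60, 1))
y = X @ np.array([1.0, -1.0, 0.5]) + S[:, 0]

def fairness_penalty(mu):
    # Dependence of the fitted predictions K a on S (HSIC-style statistic).
    a, K, Ks, H = fair_kernel_ridge(X, y, S, mu=mu)
    return a @ K @ H @ Ks @ H @ K @ a

p_none, p_strong = fairness_penalty(0.0), fairness_penalty(100.0)
```

Increasing `mu` trades prediction accuracy for a smaller dependence of the predictions on `S`, tracing out the kind of performance-fairness curve discussed in the abstract.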
Code: https://www.uv.es/pesuaya/data/code/2023_FACIL.zip
Assigned Action Editor: ~Rémi_Flamary1
Submission Number: 987