\vspace{-0.1cm}
\section{Eliciting Fairness Metrics}
\label{sec:fairme}
\vskip -0.1cm
Having understood the QPME procedure, we now discuss how our proposal can be applied to \emph{quadratic metric elicitation for algorithmic fairness}. 
Like~\cite{hiranandani2020fair}, we consider eliciting a metric that trades off predictive performance against fairness violation \citep{kamishima2012fairness, chouldechova2017fair, menon2018cost}. 
However, unlike~\cite{hiranandani2020fair}, 
we handle general quadratic fairness violations and show how QPME can be easily employed to elicit group-fair metrics. 

\vspace{-0.2cm}
\subsection{Fairness Preliminaries}
\label{ssec:fpmebackground}
\vskip -0.1cm
The fairness setting is the same as the one in Section~\ref{sec:quadme} except that we additionally have $m$ groups in the data and use $g \in [m]$ to denote the group membership. The groups are assumed to be disjoint, fixed, and known a priori 
% ~\cite{hardt2016equality, agarwal2018reductions, barocas2016big}. 
\citep{agarwal2018reductions}.
We will work with a separate (randomized) classifier $h^g : \Xcal \rightarrow \Delta_k$ for each group $g$, and use 
 $\Hcal^g = \{h^g : \Xcal \rightarrow \Delta_k\}$
 to denote the set of all classifiers for %a  group 
 $g$. 
 
\emph{Group predictive rates:} Similar to~\eqref{eq:components}, we denote the group-conditional rates for  $h^g$ 
by $\rmbf^g(h^g, \Pmbb) \in \Rmbb^{k}$, where the $i$-th entry is additionally conditioned on group $g$: % and is given by:
% \vspace{-0.05cm}
\begin{align}
	r^g_{i}(h^g, \Pmbb) \coloneqq \Pmbb(h^g = i | Y = i, G= g )\,\forall \,i \in [k].
	\label{eq:f-components}
\end{align}
% \vskip -0.2cm
Analogous to the general setup, 
% we denote the group rates by vectors $\rmbf^g(h^g, \Pmbb) = \offdiag(\Rmbf^g(h^g, \Pmbb))$, and 
we denote the set of feasible rates for group $g$  by $\Rcal^g = \{\rmbf^g(h^g, \Pmbb) \,:\, h^g \in \Hcal^g \}$. 
% For clarity, we will suppress the dependence on $\Pmbb$ and $h^g$ if it is clear from the context.

\bexample[Fairness violation]
\emph{
A popular criterion for group fairness is the equal opportunity criterion of \cite{hardt2016equality}, which for a binary classification setup with $m$ protected groups, would require that $r_1^u = r_1^v$ for each pair of groups $(u,v)$. This can be formulated as constraints $|r_1^u - r_1^v|\leq \epsilon$, for some slack $\epsilon$ for all pairs $(u,v)$~\citep{agarwal2018reductions}, or more generally as a regularization term in the learning objective~\citep{bechavod2017learning, hardt2016equality},  by measuring the squared difference between the group rates: $\phi^{\text{EOpp}}((\rmbf^{1},\dots,\rmbf^{m}))
 = {{m\choose 2}}^{-1}\sum_{v>u}(r_1^u - r_1^v)^2$. 
Another popular criterion is {equalized odds}, which requires equal rates across different protected groups 
% \cite{hardt2016equality,bechavod2017learning}.
\citep{bechavod2017learning}.
This again can be specified as a quadratic objective:
%for each group. 
% With $m$ groups, this is given by:
$\phi^{\eo}((\rmbf^{1},\dots,\rmbf^{m})) \,=\, {[k{m\choose 2}]}^{-1}\sum_{ v>u}\sum_{i=1}^k \left(r^u_i - r^v_i\right)^2$. 
Other fairness criteria that can be expressed as quadratic metrics 
include 
% {equal opportunity} $\phi^{\text{EOpp}}((\rmbf^{1},\dots,\rmbf^{m})) = {{m\choose 2}}^{-1}\sum_{v>u}(r_1^u - r_1^v)^2$~\citep{hardt2016equality}, 
 {balance for the negative class}, which for a binary classification problem is given by $\phi^{\text{BN}}((\rmbf^{1},\dots,\rmbf^{m}))
  = {{m\choose 2}}^{-1}\sum_{v>u}(r_2^u - r_2^v)^2$~\citep{kleinberg2017inherent}, and the {error-rate balance} $\phi^{\text{EB}}((\rmbf^{1},\dots,\rmbf^{m})) =  {{m\choose 2}}^{-1}\frac{1}{2}\sum_{v>u}\left[(r_1^u - r_1^v)^2 + (r_2^u - r_2^v)^2\right]$~\citep{chouldechova2017fair}, and their weighted variants. 
}
\eexample
\vskip -0.1cm
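As a quick numerical illustration (with hypothetical rates chosen only for exposition, not taken from any experiment), take $k=2$, $m=2$, $\rmbf^1 = (0.8, 0.6)$, and $\rmbf^2 = (0.7, 0.9)$. Since ${2\choose 2} = 1$, the violations above evaluate to
\begin{align*}
\phi^{\text{EOpp}} &= (0.8 - 0.7)^2 = 0.01, \\
\phi^{\text{BN}} &= (0.6 - 0.9)^2 = 0.09, \\
\phi^{\eo} &= \tfrac{1}{2}\left[(0.8 - 0.7)^2 + (0.6 - 0.9)^2\right] = 0.05.
\end{align*}
Reading the sum in $\phi^{\text{EB}}$ as ranging over both squared differences, we also get $\phi^{\text{EB}} = 0.05$; that is, for binary problems the error-rate balance coincides with equalized odds.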

In the next section, we introduce a general family of metrics that trades off between an %weighted
error term and a quadratic fairness violation term,
for which we will need to define the rates for the overall classifier.

\emph{Rates for overall classifier:} We construct the overall classifier $h : (\Xcal, [m]) \rightarrow \Delta_k$ by predicting with classifier $h^g$ for group $g$, i.e.\ $h(\xmbf, g) \coloneqq h^g(\xmbf)$.  We will be interested in both the fairness violation and predictive performance of the overall classifier. 
For the former, we will need the $m$ group-specific rates, represented together as a tuple: 
$$\rmbf^{1:m} \coloneqq  (\rmbf^1, \dots, \rmbf^m) \in \Rcal^1 \times \dots \times \Rcal^m =: \prodRcal.$$ 
For the latter, we will measure the overall rates for $h$ as described in~\eqref{eq:components}. The overall rates can also be written in terms of group-specific rates as:
$\rmbf = \sum_{g=1}^m \bm{\tau}^g \odot \rmbf^g,$ where $\bm{\tau}^g$ is a constant vector whose $i$-th entry denotes the prevalence of group $g$ within class $i$, i.e., $\Pmbb(G=g|Y=i)$.
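
This decomposition is an instance of the law of total probability: for each class $i \in [k]$,
\begin{align*}
\Pmbb(h = i \,|\, Y = i) = \sum_{g=1}^m \Pmbb(G = g \,|\, Y = i)\,\Pmbb(h^g = i \,|\, Y = i, G = g) = \sum_{g=1}^m \tau^g_i\, r^g_i.
\end{align*}
For instance, with the hypothetical values $\bm{\tau}^1 = (0.6, 0.3)$, $\bm{\tau}^2 = (0.4, 0.7)$, $\rmbf^1 = (0.8, 0.6)$, and $\rmbf^2 = (0.7, 0.9)$ (chosen only for illustration), the overall rates are $\rmbf = (0.6 \cdot 0.8 + 0.4 \cdot 0.7,\; 0.3 \cdot 0.6 + 0.7 \cdot 0.9) = (0.76, 0.81)$.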

\vspace{-0.2cm}
\subsection{Fair Quadratic Metric Elicitation}
\label{ssec:f-metric}
\vskip -0.1cm

We seek to elicit a metric %similar to %Definition~1 in ~\cite{hiranandani2020fair}, 
that trades off predictive performance (a linear function of overall rates $\rmbf$) against fairness violation (a quadratic function of group rates $\rmbf^{1:m}$). For simplicity, we will denote the fairness metric in cost form, i.e., lower values are better. 

\begin{figure}[t]
    \centering
    \hspace{-0.1cm}
    % \vskip -0.1cm
    \includegraphics[scale=0.5]{plots/Fair_QME_horizontal.png}
    \vskip -0.1cm
     \caption{Eliciting Fair Quadratic Metrics %(Definition \ref{def:fpme}) 
     for two groups. We formulate a $k$-dimensional elicitation problem and use a variant of QPME (Algorithm~1).
    }
    \label{fig:fairness-workflow}
    \vskip -0.45cm
\end{figure}

\bdefinition \emph{(Fair Quadratic Performance Metric)} For misclassification costs  $\ambf \in \Rmbb^k$, $\ambf \geq 0$, 
% fairness violation costs $\mathbb{B} \,=\, \{\Bmbf^{uv} \in \Rmbb^{q\times q}\}_{u, v=1, v>u}^m$, 
fairness violation costs $\mathbb{B} \,=\, \{\Bmbf^{uv} \in \mathrm{PSD}_k\}_{u, v=1, v>u}^m$, 
and a trade-off parameter $\lambda \in [0,1]$, we define:
% \vspace{-0.15cm}
\begin{align*}
&\phi^\fair(\tupr; \ambf, \mathbb{B}, \lambda) \,\coloneqq\, (1-\lambda)\inner{\ambf}{\bm{1} - \rmbf} ~+~
\\&
\qquad\quad\quad
\lambda \frac{1}{2} \left(\sum\nolimits_{v>u} (\rmbf^u - \rmbf^v)^T\Bmbf^{uv}(\rmbf^{u} - \rmbf^v)\right)
    \numberthis \label{eq:f-linmetric},
\end{align*}
% \vskip -0.25cm
where w.l.o.g.\
the parameters $\ambf$ and $\Bmbf^{uv}$'s are normalized:
% \begin{align*}
    $\Vert \ambf \Vert_2 = 1, \, \frac{1}{2}\sum_{v>u}^{m} \Vert \Bmbf^{uv} \Vert_F = 1.$
    % \numberthis
    % \label{eq:f-scaleinvariance}
% \end{align*}
\label{def:f-linmetric}
\edefinition

The coefficients $\ambf$ and $\Bmbf^{uv}$'s are separately normalized so that the predictive performance and fairness violation are on the same scale, and we can additionally elicit the trade-off parameter $\lambda$. Analogous to Definitions \ref{def:query}--\ref{def:me}, the problem of \emph{Fair Quadratic Metric Elicitation} is as follows: given access to pairwise oracle queries of the form $\Omega(\tuprhat_1, \tuprhat_2)$, recover a metric $\hphi^\fair = (\ambfhat, \hat{\mathbb{B}}, \lambdahat)$ such that $\Vert\phi^\fair - \hphi^\fair\Vert < \kappa$ under a suitable norm $\Vert \cdot \Vert$ for small $\kappa > 0$. 
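
To see how a familiar criterion fits this family, consider (as an illustrative special case, writing $I_k$ for the $k \times k$ identity matrix, a notation we assume here) the choice $\Bmbf^{uv} = c\, I_k$ for all pairs $v > u$. Since $\Vert I_k \Vert_F = \sqrt{k}$, the normalization $\frac{1}{2}\sum_{v>u} \Vert \Bmbf^{uv} \Vert_F = 1$ forces $c = 2 / \big(\sqrt{k}{m\choose 2}\big)$, and the fairness term in~\eqref{eq:f-linmetric} becomes
\begin{align*}
\lambda\, \frac{1}{2} \sum\nolimits_{v>u} c\, \Vert \rmbf^u - \rmbf^v \Vert_2^2
= \frac{\lambda}{\sqrt{k}{m\choose 2}} \sum\nolimits_{v>u} \sum_{i=1}^k (r^u_i - r^v_i)^2
= \lambda \sqrt{k}\; \phi^{\eo}(\rmbf^{1:m}).
\end{align*}
Thus the equalized odds violation is recovered up to the factor $\sqrt{k}$ induced by the normalization.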

Similar to Section \ref{ssec:mpme}, we study the space of feasible rates $\prodRcal$ under the following mild assumption. 

\bassumption
For all $g\in[m]$, the conditional distributions $\Pmbb(Y=j|X, G=g), \, j \in [k],$ are distinct, i.e., there is some signal for non-trivial classification for each group.
\label{as:f-distinct}
\eassumption
\bproposition
[Geometry of $\prodRcal$; Figure~\ref{fig:geometry}(b)] For each group $g$, a classifier that predicts class $i$ on all inputs results in the same rate vector $\embf_i$. The rate space $\Rcal^g$ for each group $g$ is convex and so is the intersection 
$\Rcal^1 \cap \dots \cap \Rcal^m$, which also contains the rate profile $\ombf = \tfrac{1}{k} \sum_{i=1}^k \embf_i$ (achieved by the uniform random classifier) in its interior. 
\label{prop:f-C}
\eproposition

\bremark[Existence of sphere $\overline{\Scal}$]
There exists a %$q$-dimensional 
sphere $\overline{\Scal} \subset \Rcal^1 \cap \dots \cap \Rcal^m$ of radius $\rho > 0$ centered at $\ombf$. Thus, a rate $\smbf \in\overline{\Scal}$ is feasible for each of the $m$ groups, i.e.,\ $\smbf$ is achievable by some classifier $h^g$ for each group $g \in [m]$.
\label{as:f-sphere}
\eremark
\vskip -0.1cm
% \vskip  -0.1cm

Because we allow a separate classifier for each group, Remark~\ref{as:f-sphere} implies that any rate  $\rmbf^{1:m} = (\smbf^1, \ldots, \smbf^m)$ for arbitrary points $\smbf^1, \ldots, \smbf^m \in \overline{\Scal}$ is achievable for some choice of group-specific classifiers $h^1, \ldots, h^m$. This observation will be key to the elicitation algorithm we describe next.

\vspace{-0.2cm}
\subsection{Eliciting Metric Parameters $({\ambf}, \mathbb{B}, \lambda)$}
\vskip -0.1cm
We present a strategy for eliciting fair metrics 
% (Def. \ref{def:f-linmetric}) 
by adapting the QPME algorithm. For simplicity, we focus on the $m=2$ case and extend our approach to $m>2$ 
% multiple groups 
in Appendix \ref{append:sec:fpme}. 


Observe that for a rate profile $\rmbf^{1:2} = (\smbf, \ombf)$, where the first group is assigned an arbitrary point in $\overline{\Scal}$ and the second group is assigned the uniform random classifier's rate $\ombf$, the fair metric~\eqref{eq:f-linmetric} becomes: $\phi^\fair((\smbf, \ombf); \ambf,\, \Bmbf^{12}, \lambda)$
% \vspace{-0.4cm}
\begin{align*}
& \hspace{-0.5cm}= (1-\lambda)\inner{\ambf}{\bm{1} - (\bm{\tau}^1 \odot \smbf + \bm{\tau}^2 \odot \ombf)} ~~+ \\
% &\quad\quad\quad
&\qquad \qquad \qquad \frac{\lambda}{2} (\smbf - \ombf)^T\Bmbf^{12}(\smbf - \ombf)\vspace{-0.2cm}\\
& \hspace{-0.5cm}= \inner{\dmbf}{\smbf - \ombf} +
\frac{1}{2} (\smbf - \ombf)^T\Bmbf(\smbf - \ombf) + c \\ & \hspace{-0.5cm}=: \overline{\phi}(\smbf; \dmbf, \Bmbf) + c,
    \numberthis \label{eq:f-linmetricshift}
\end{align*}
% \vskip -0.25cm
where $\dmbf = -(1-\lambda)\taumbf^1\odot\ambf$, $\Bmbf = \lambda \Bmbf^{12}$, and $c = (1-\lambda)\inner{\ambf}{\bm{1} - \ombf}$ is a constant that does not depend on $\smbf$ and hence does not affect pairwise comparisons; we use $\taumbf^1 + \taumbf^2 =\bm{1}$ (the vector of ones) for the second step. The metric $\overline{\phi}$ above is a particular instance of the quadratic metric in~\eqref{eq:quadmetshift}.  %in $q$ dimensions. 
We can thus apply a slight variant of the QPME procedure in Algorithm~1 to solve the quadratic metric elicitation problem over the sphere $\Scal' = \{(\smbf, \ombf) \,|\, \smbf \in \overline{\Scal}\}$ with the modified oracle $\Omega'(\rmbf_1, \rmbf_2) = \Omega((\rmbf_1, \ombf), (\rmbf_2, \ombf))$. 
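
For completeness, the linear term can be expanded as follows (a short derivation using only $\taumbf^1 + \taumbf^2 = \bm{1}$ and the definition $\dmbf = -(1-\lambda)\taumbf^1\odot\ambf$):
\begin{align*}
\bm{1} - (\taumbf^1 \odot \smbf + \taumbf^2 \odot \ombf)
&= \bm{1} - \taumbf^1 \odot \smbf - (\bm{1} - \taumbf^1) \odot \ombf \\
&= (\bm{1} - \ombf) - \taumbf^1 \odot (\smbf - \ombf), \\
(1-\lambda)\inner{\ambf}{\bm{1} - (\taumbf^1 \odot \smbf + \taumbf^2 \odot \ombf)}
&= (1-\lambda)\inner{\ambf}{\bm{1} - \ombf} - (1-\lambda)\inner{\taumbf^1 \odot \ambf}{\smbf - \ombf} \\
&= (1-\lambda)\inner{\ambf}{\bm{1} - \ombf} + \inner{\dmbf}{\smbf - \ombf}.
\end{align*}
The $\smbf$-independent term $(1-\lambda)\inner{\ambf}{\bm{1} - \ombf}$ appears on both sides of any comparison $\Omega'(\smbf_1, \smbf_2)$ and therefore cancels, so the oracle responses depend on $\smbf$ only through $\inner{\dmbf}{\smbf - \ombf}$ and the quadratic term.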

The only change needed for the algorithm is in line 5, where 
we need to account for the changed relationship between $\dmbf$ and $\ambf$ and need to separately (not jointly) normalize the linear and quadratic coefficients. With this change, the output of the algorithm directly gives us the required estimates. 
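Specifically, from step 1 of Algorithm~1 and \eqref{eq:0col}, we have $\hat{d}_i = -(1-\lambda)\tau^1_i \hat{a}_i$. Dividing each $\hat{d}_i$ by $-\tau^1_i$ (assuming $\tau^1_i > 0$ and $\lambda < 1$) and normalizing, we 
% directly get an 
get $\hat{a}_i = \frac{-\hat{d}_i/\tau^1_i}{\sqrt{\sum_{j=1}^k (\hat{d}_j/\tau^1_j)^2}}$ for the linear coefficients. 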
Similarly, steps 2--4 of Algorithm~1 and \eqref{eq:poly2elicitamatfinal} 
allow us to express $\hat{B}_{ij} =\lambda\hat{B}^{12}_{ij}$ in terms of $\hat{a}_1$. After normalizing so that $\frac{1}{2}\|\Bmbfhat^{12}\|_F = 1$ (Definition~\ref{def:f-linmetric} with $m=2$), we directly get estimates 
$\Bmbfhat^{12} = {2\,\Bmbfhat}/{\|\Bmbfhat\|_F}$ for the quadratic coefficients.
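
As a small sanity check of the linear step (with hypothetical numbers chosen only for illustration), let $k=2$, $\ambf = (0.6, 0.8)$, $\lambda = 0.5$, and $\taumbf^1 = (0.5, 0.25)$. The relation $d_i = -(1-\lambda)\tau^1_i a_i$ gives
\begin{align*}
\dmbf = -0.5 \cdot (0.5 \cdot 0.6,\; 0.25 \cdot 0.8) = (-0.15, -0.10),
\qquad
\left(-\tfrac{d_1}{\tau^1_1},\, -\tfrac{d_2}{\tau^1_2}\right) = (0.3, 0.4),
\end{align*}
and dividing by $\Vert (0.3, 0.4) \Vert_2 = 0.5$ returns $(0.6, 0.8) = \ambf$. The factor $(1-\lambda)$ cancels in the normalization, so any positive rescaling of $\dmbf$ yields the same $\ambf$; the prevalences $\tau^1_i$, by contrast, must be divided out entrywise before normalizing.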

Finally, because the linear and quadratic coefficients are separately  normalized, the estimates $\ambfhat,\, \Bmbfhat^{12}$ are independent of the trade-off parameter $\lambda$.  
% \subsection{Eliciting Trade-off Parameter $\lambda$}
Given estimates {\small$\hat{B}^{12}_{ij}$} and $\ahat_1$,  we can now additionally estimate the trade-off parameter {\small$\hat{\lambda}$}. See Appendix \ref{append:sec:fpme} for details %. However, it is quite easy by virtue of recovering $\ambfhat$ and $\Bmbfhat^{12}$ above. We can
%from~\eqref{eq:fairBij}. 
and Figure \ref{fig:fairness-workflow} for an illustration. % of the procedure.