\begin{figure*}[t]
	\centering 
	\vskip -0.1cm
	\subfigure[]{
% 		{\includegraphics[width=3.5cm]{plots/q_a_error.png}}
        \hspace{-0.3cm}
		{\includegraphics[width=3.35cm]{plots/qme_a_bw.pdf}}
		\label{fig:q_rec_a}
				% \caption{a = 1}
	}
	\subfigure[]{
% 		{\includegraphics[width=3.5cm]{plots/q_B_error.png}}
        \hspace{-0.3cm}
		{\includegraphics[width=3.35cm]{plots/qme_B_bw.pdf}}
		\label{fig:q_rec_B}
		%		\caption{a = 1}
	}
	\subfigure[]{
% 		{\includegraphics[width=3.5cm]{plots/f_a_error.png}}
        \hspace{-0.3cm}
		{\includegraphics[width=3.35cm]{plots/fpme_a.pdf}}
		\label{fig:f_rec_a}
		%		\caption{a = 1}
	}
	\subfigure[]{
% 		{\includegraphics[width=3.5cm]{plots/f_B_error.png}}
        \hspace{-0.3cm}
		{\includegraphics[width=3.35cm]{plots/fpme_B.pdf}}
		\label{fig:f_rec_B}
		%		\caption{a = 1}
	}
	\subfigure[]{
% 		{\includegraphics[width=3.5cm]{plots/f_l_error.png}}
        \hspace{-0.3cm}
		{\includegraphics[width=3.35cm]{plots/fpme_lamb.pdf}}
		\label{fig:f_rec_l}
		%		\caption{a = 1}
	}
	\vskip -0.2cm
	\caption{Average elicitation error over 100 metrics as a function of the number of classes $k$ and groups $m$ for quadratic metrics in Definition~\ref{def:quadmet} (a--b) and fairness metrics in Definition~\ref{def:f-linmetric} (c--e). See Table~\ref{tab:numqueries} in Appendix~\ref{append:sec:extexp} for the number of queries needed.
% 	The results are averaged over 100 random metrics. 
% 	[Placeholder figure]
	}
	\label{fig:recovery}
	\vskip -0.15cm
\end{figure*}


% \vspace{-0.15cm}
\section{Guarantees}
\label{sec:guarantees}
% \vskip -0.15cm

We discuss guarantees for the QPME procedure under the following practically relevant feedback model. %, which is useful in practice. 
The fair metric elicitation guarantees follow %directly 
as a consequence.

% \vspace{-0.1cm}
\bdefinition[Oracle Feedback Noise: $\epsilon_\Omega \geq 0$] Given rates $\rmbf_1, \rmbf_2$, % \in \Rcal$, 
the oracle responds correctly whenever $|\phi^{\quadr}(\rmbf_1) - \phi^{\quadr}(\rmbf_2)| > \epsilon_\Omega$ and may respond incorrectly otherwise.
\label{def:noise}
\edefinition
% \vskip -0.2cm

In words, the oracle may respond incorrectly if the rates are close 
 as measured by the metric. 
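As a purely hypothetical illustration (the numerical values below are chosen only for exposition and do not come from our experiments), take $\epsilon_\Omega = 0.01$. Then
\[
|\phi^{\quadr}(\rmbf_1) - \phi^{\quadr}(\rmbf_2)| = 0.05 > \epsilon_\Omega
\quad \mbox{(answered correctly)},
\qquad
|\phi^{\quadr}(\rmbf_1) - \phi^{\quadr}(\rmbf_2)| = 0.005 \leq \epsilon_\Omega
\quad \mbox{(either answer possible)}.
\]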
Since eliciting the metric involves offline computations %including certain 
of ratios, %we discuss guarantees under the
we make a %following
regularity assumption
ensuring that all components are well defined. 
% \vspace{-0.1cm}
\bassumption
For the shifted quadratic metric $\bphi$ in~\eqref{eq:quadmetshift},  %assume that 
the gradients at the rate profiles $\ombf$, $-\zmbf_1$, and $\{\zmbf_1, \dots, \zmbf_q\}$ are non-zero vectors. 
Additionally, $\rho > \varrho \gg \epsilon_\Omega$.
\label{as:regularity-q}
\eassumption
% \vskip -0.2cm

% \vspace{-0.15cm}
\btheorem
Given $\epsilon,\epsilon_\Omega\geq 0$, and a 1-Lipschitz metric $\phi^{\quadr}$ (Def.\ \ref{def:quadmet}) parametrized by $\ambf, \Bmbf$, under Assumptions~\ref{assump:distribution},  \ref{assump:smoothness},  and \ref{as:regularity-q}, after $O\left(k^2\log \tfrac 1 {\epsilon}\right)$ queries, Algorithm~1 returns a metric $\hphi^{\quadr} = (\ambfhat, \Bmbfhat)$ with
% \vspace{-0.25cm}
% \begin{itemize}[itemsep=0pt, leftmargin=1em]
     %$\ambfhat$ such that 
    $\Vert \ambf-\ambfhat \Vert_{2}\leq O\left(\sqrt{k}(\epsilon+\sqrt{\varrho+\epsilon_\Omega/\varrho})\right)$ 
    and %$\Bmbfhat$ such that 
    $\Vert \Bmbf -\Bmbfhat \Vert_{F}\leq O\left(k\sqrt{k}(\epsilon+\sqrt{\varrho + \epsilon_\Omega/\varrho})\right)$.
% \end{itemize}
\label{thm:q-me}
\etheorem
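
As a rough reading of the noise-dependent term in Theorem~\ref{thm:q-me}, consider the following sketch. It assumes $\epsilon_\Omega > 0$, treats $\varrho$ as a free parameter subject to $\rho > \varrho \gg \epsilon_\Omega$ (Assumption~\ref{as:regularity-q}), and ignores the constants hidden in the $O(\cdot)$ notation. Under these assumptions, the quantity $\varrho + \epsilon_\Omega/\varrho$ is minimized over $\varrho > 0$ at $\varrho = \sqrt{\epsilon_\Omega}$:
\[
\frac{d}{d\varrho}\left(\varrho + \frac{\epsilon_\Omega}{\varrho}\right) = 1 - \frac{\epsilon_\Omega}{\varrho^2} = 0
\;\Longrightarrow\;
\varrho = \sqrt{\epsilon_\Omega},
\qquad
\sqrt{\varrho + \frac{\epsilon_\Omega}{\varrho}}\,\Bigg|_{\varrho = \sqrt{\epsilon_\Omega}} = \sqrt{2}\,\epsilon_\Omega^{1/4}.
\]
For $0 < \epsilon_\Omega \ll 1$ this choice is consistent with $\varrho \gg \epsilon_\Omega$, and, provided $\rho > \sqrt{\epsilon_\Omega}$, the two bounds would then read $O\left(\sqrt{k}(\epsilon + \epsilon_\Omega^{1/4})\right)$ and $O\left(k\sqrt{k}(\epsilon + \epsilon_\Omega^{1/4})\right)$, respectively.
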
% \vskip -0.1cm
% The proof in App.\ \ref{append:sec:guarantees} 
The proof of Theorem~\ref{thm:q-me} uses the guarantee for LPME \emph{only} as an intermediate step, and substantially builds on it to take into account the smoothness of the non-linear metric, the multiplicative errors in the slopes, and the feedback noise. We also provide a \emph{finite sample version} of Theorem~\ref{thm:q-me} in Corollary~\ref{append:cor:finite} (Appendix~\ref{append:sec:guarantees}), which states that the above result holds with high probability as long as (i) the hypothesis class of classifiers has finite capacity, and (ii) the number of samples used to estimate the rates is large enough. 
\btheorem(\textbf{Lower Bound})
For any $\epsilon > 0$, at least $\Omega(k^2\log(1/(k\sqrt k\epsilon)))$ pairwise queries are needed
to elicit a quadratic metric (Def.\ \ref{def:quadmet})
to an error tolerance of $k\sqrt k\epsilon$. %for some (slack) $\epsilon$. 
\label{thm:lb}
\etheorem
Theorem~\ref{thm:q-me} shows that the QPME procedure is robust to noise and that its query complexity depends only \emph{linearly} on the number of unknowns. Theorem~\ref{thm:lb} shows that the inherent complexity of the problem depends on the \emph{number of unknowns}; thus, our query complexity is optimal (barring the log term). The $\tilde{O}(k^2)$ complexity is therefore merely an artifact of our setup in Definition~\ref{def:quadmet} being very general (with $O(k^2)$ unknowns).
Indeed, with added structural assumptions on the metric, our proposal can be modified to considerably reduce the query complexity. For example, if we know that the matrix $\Bmbf$ is diagonal, then each  LPME subroutine call  
needs to estimate only one parameter, which can be done with a constant number of queries, requiring a total of only {\small$\tilde O(k)$} queries. % which is again \emph{linear} in the number of unknowns.
We also stress that despite eliciting a more complex (non-linear) metric, the query complexity is still \emph{linear in the number of unknowns}, which is the same as for prior linear elicitation methods~\citep{hiranandani2018eliciting, hiranandani2019multiclass}. 
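
To make the counting above explicit, here is a back-of-the-envelope sketch. It takes the logarithmic factor from the query bound in Theorem~\ref{thm:q-me} rather than deriving it, and uses only the fact that $\ambf$ has $k$ entries and $\Bmbf$ has at most $k^2$ entries, of which $k$ remain when $\Bmbf$ is diagonal:
\[
\underbrace{k + k^2}_{\mbox{\small general } (\ambf, \Bmbf)} = O(k^2)
\;\Longrightarrow\;
O\left(k^2 \log \tfrac{1}{\epsilon}\right) \mbox{ queries},
\qquad
\underbrace{k + k}_{\mbox{\small diagonal } \Bmbf} = O(k)
\;\Longrightarrow\;
\tilde{O}(k) \mbox{ queries}.
\]
In both cases the number of queries is the number of unknowns times a logarithmic factor.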