\section{Introduction}
% \looseness -3 
% \vspace{-1mm}
Many decision-making tasks involve maximizing utility functions \citep{chen2015submodular, jackson2019value}.  
As an example, utility in active learning (AL) can be represented in various forms, such as expected error rate reduction \citep{mussmann2022active, roy2001toward}, mutual information between the labeled and unlabeled datasets \citep{sourati2016classification, adaimi2019leveraging, lindley1956measure}, or the uncertainty of model predictions \citep{settles2012active, shen2017deep, kossen2022active}. 
However, maximizing utility under budget constraints in AL is notoriously challenging. It is well known that finding the maximally informative set under a cardinality constraint is NP-hard \citep{ko1995exact, chen2015sequential}. In classification tasks, determining the groundtruth utility of a \textit{subset} of the training data requires retraining the classifier on that subset and then evaluating it on the validation set. Computing the utility of the \textit{best} possible subset for a downstream task is therefore computationally infeasible, since it would require examining every candidate subset \citep{engstrom2024dsdm}. 
% \looseness -2
Moreover, common AL methods rely on acquisition functions with high \textit{adaptivity} to the environment: the selections in the current round depend on the responses to the labeling requests from all previous rounds.
This reliance is a major obstacle to deploying these algorithms in real-world applications, where there can be a substantial delay between requesting labels and receiving feedback. For instance, feedback from wet-lab or physics experiments can take days or even months to obtain \citep{botu2015adaptive, yang2019machine}, which limits the number of rounds of interaction with labelers and, in turn, raises the risk of selecting redundant or less effective training examples within a batch.

%To simplify the setting of problem, 
We thus ask: \textit{How can we develop a robust acquisition criterion for AL with only one round of interaction with annotators under a fixed budget constraint?}
So far, dominant AL approaches rely on customized utility metrics characterizing the current model's behavior. 
Recent works \citep{ash2019deep, killamsetty2021glister, saran2023streaming, sener2017active} propose using gradients of the current model computed from the \textit{pseudo labels} of the unlabeled data. Yet these gradient estimates can be unreliable in the single-round AL setting because the labeled pool is small. 
The datamodels framework \citep{ilyas2022datamodels} demonstrates a linear relationship between the choice of training data and model predictions, a seemingly promising paradigm for designing acquisition functions.
However, that framework operates in the \textit{supervised} setting: it requires \textit{labeled} subsets of the training data and studies how the choice of training set affects model predictions. In contrast, acquisition criteria in AL are defined as functions that map \textit{unlabeled} instances, i.e., instances without label information, to real-valued utilities.

\begin{figure*}[!t]%
\centering
  \includegraphics[width=\textwidth]{fig/overview.pdf}
  % \vspace{-2mm}
  \caption{Overview of the \algname algorithm. In the pretraining stage, we learn a RankNet over pairs of utility samples via multi-task bilevel optimization; in the acquisition stage, we follow the learned utility function to iteratively query data points in minibatches. Details of the algorithm are provided in Section~\ref{DUAL_MAX}.}\label{fig:overview}
  % \vspace{-2mm}
\end{figure*}

In this paper, we focus on enhancing the \textit{robustness} and \textit{generalizability} of deep active learning in the one-round setting. Because deep learning models vary with initialization, hyperparameters, network architecture, and training procedure \citep{jiang2021assessing, d2022underspecification,zhong2021larger}, a one-shot estimate of validation accuracy can be highly stochastic. We therefore resort to \textit{ranking} as a strategy to mitigate this inherent uncertainty. Rather than learning a predictor of validation accuracy, we (approximately) compare which subset of the unlabeled pool would lead to better generalization on the validation set. Concretely, in Section~\ref{DUAL_MAX}, we \textit{approximate} the relative utility of equal-size subsets of the training data via a novel variant of RankNet \citep{burges2005learning}, which we refer to as the \textit{utility model}. This is achieved by integrating a set-based neural network architecture, which lets us extend comparisons from pairs of individual \textit{examples} to pairs of \textit{sets}. 
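
For intuition, the standard pairwise RankNet objective of \citet{burges2005learning} can be sketched as follows. The notation here is assumed for illustration only and need not match the variant developed in Section~\ref{DUAL_MAX}: $S_i$ and $S_j$ denote two equal-size subsets, $f_\theta$ a hypothetical set-based scoring network, and $\bar{P}_{ij} \in \{0, \tfrac{1}{2}, 1\}$ the target indicating whether $S_i$ has lower, equal, or higher observed utility than $S_j$. The model's predicted probability that $S_i$ outranks $S_j$ is
\[
  P_{ij} = \frac{1}{1 + \exp\bigl(-(f_\theta(S_i) - f_\theta(S_j))\bigr)},
\]
and the pairwise loss is the cross-entropy between the target and this prediction,
\[
  \ell_{ij}(\theta) = -\bar{P}_{ij} \log P_{ij} - \bigl(1 - \bar{P}_{ij}\bigr) \log\bigl(1 - P_{ij}\bigr).
\]
Only the difference of the two scores enters the loss, so the model needs to recover the \textit{ordering} of utilities rather than their absolute values, which is what makes ranking less sensitive to noisy one-shot accuracy estimates.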

\looseness -1 To accommodate the growing labeled pool, we separate samples based on the size of inputs and employ bilevel training to account for the growing training history. We introduce a multi-task learning framework that uses the optimal transport distance \citep{alvarez2020geometric} between the current labeled data and the validation set as an additional loss. This loss regularizes the utility model to improve generalization to new, unlabeled data while remaining agnostic to the training dynamics of the underlying classifier. Furthermore, to refine the utility model estimate and reduce the computational overhead of obtaining groundtruth utility samples during the pretraining stage, we employ interpolation-based techniques to augment utility samples (defined in Section 4.1). 

We summarize the above algorithmic insights into a novel learning-based acquisition strategy, namely \algname (\underline{R}anking-based \underline{A}ctive learning via \underline{M}ultitask \underline{B}ilevel \underline{O}ptimization), as illustrated in \figref{fig:overview}. 
We conduct extensive experiments on various active learning benchmarks in image classification and show that \algname consistently outperforms existing learning/regression-based active learning algorithms by a significant margin.
Our method offers a promising alternative for maximizing data utility under budget constraints, unlocking potential applications in a wide range of classification tasks.

%that addresses the limitations of existing methods reliant on expensive acquisition functions or overly generic heuristics. Our algorithm is summarized in \figref{fig:overview}.

%To refine the utility model estimation and reduce the requests for groundtruth utility samples during the pretraining stage.


% In summary, 
% \vspace{-2mm}
% \begin{itemize} \denselist
% \item We propose a novel learning-based acquisition strategy called \algname (\underline{R}anking-based \underline{A}ctive learning via \underline{M}ultitask \underline{B}ilevel \underline{O}ptimization) that addresses the limitations of existing methods reliant on expensive acquisition functions or overly generic heuristics. Our algorithm is summarized in \figref{fig:overview}.
% \item We introduce a bi-level learning algorithm to enhance validation performance, enabling the learning of generalizable utility functions as the labeled data grows over time.
% \item We employ interpolation-based techniques to augment utility samples (defined in Section~\ref{sec:framework}), 
% refining utility model estimation and reducing the requests for groundtruth utility samples during the pretraining stage.
% \item We incorporate a multi-task learning approach, leveraging the optimal transport distance between the labeled dataset and validation set as a regulatory loss, guiding the behavior of our RankNet.
% \item We conduct extensive experiments on various image classification tasks, demonstrating the effectiveness of our proposed approach. Our method also offers a promising alternative for maximizing data utility under budget constraints, unlocking potential applications in a wide range of classification tasks.
% \end{itemize}
% \vspace{-1mm}






