\section{Related Work}\label{sec:rl}


A distinctive feature of GMP and GMPMW is their staged progression of task cost observability: the cost is initially unknown, becomes partially observable as a distribution upon task arrival, and is fully revealed as an exact value upon task completion.
Figure~\ref{fig:survey} compares the progression of task cost observability for different problems. Observability may change at four key instants (before task arrival, upon arrival, upon discarding, and after processing) and is categorized into three levels: Unknown (no information), Distribution (distribution known), and Value (exact value known).

We compare GMP and GMPMW (shown in red) against the Online Generalized Assignment Problem (OGAP), the Online Stochastic Generalized Assignment Problem (OSGAP), the Online Knapsack Problem (OKP), the Online Stochastic Knapsack Problem (OSKP), and the Bandit with Knapsack (BwK) problem, including its Full Feedback (BwK-F) and Bandit Feedback (BwK-B) variants. Unlike the other problems, which exhibit only two levels of task cost observability, GMP and GMPMW are the only ones that pass through all three levels. We elaborate on these differences below and provide a more detailed comparison in Appendix~\ref{sec:A.B}.

\begin{figure}[t]
    \centering
    \includegraphics[width=1\linewidth]{Figures/surveyMain.pdf}
    \caption{Progression of task cost observability in online problems.}
    \label{fig:survey}
\end{figure}


\subsection{GMP, OGAP, and OSGAP}

GMP was originally studied by~\citet{alaei2013online}, who achieved a competitive ratio of $1-K^{-1/2}$, where $K$ is the resource budget. The Magician's Problem (MP)~\citep{alaei2014bayesian} is a special case of GMP with random $0$-$1$ costs. \citet{srinivasan2022generalized} studied a variant of GMP with unknown distributions of resource consumption. GMP has since been applied in e-commerce~\citep{amil2022multi} and transportation~\citep{jiang2022approximation}. However, none of the above works can accommodate the multiple workers in GMPMW.

GMP has been adopted to tackle other online problems, such as OSGAP~\citep{alaei2013online}. In OSGAP, each task belongs to a type drawn from a known distribution. Upon arrival, the task's type is revealed, specifying the reward and the cost distribution. \citet{yoshinaga2023size} extended OSGAP to consider a more limited resource budget. \citet{liu2023online} and~\citet{li2023sample} studied OGAP with adversarial task rewards and costs.
However, as shown in Figure~\ref{fig:survey}, both OSGAP and OGAP follow a similar observability progression: the exact task cost is revealed upon arrival (OGAP) or upon processing (OSGAP) without an intermediate level. Consequently, none of these problems considers the full progression of task cost observability involving all three levels as in GMP and GMPMW.

\subsection{OKP, OSKP, and BwK}
  OKP~\citep{zhou2008budget,bockenhauer2014online} assumes that the decision-maker initially has no knowledge of either task rewards or costs, and OSKP assumes that the decision-maker knows the distribution~\citep{papastavrou1996dynamic,dean2008approximating} or the exact value~\citep{jiang2022tight} of task costs.
  However, the exact task cost in both problems is revealed upon arrival.
  In BwK~\citep{badanidiyuru2018bandits,immorlica2022adversarial,dragobandits}, the task's reward and cost are not known to the decision-maker in advance and are revealed either after the task is processed (BwK-B) or after the task is processed or discarded (BwK-F). To summarize, as illustrated in Figure~\ref{fig:survey}, none of OKP, OSKP, or BwK considers the full progression of task cost observability as in GMP and GMPMW.


\subsection{Other Online Optimization Problems}

Recently,~\citet{feldman2021online} introduced the Online Contention Resolution Scheme (OCRS) as a rounding scheme for solving the Online Bayesian Optimization Problem (BOP)~\citep{chawla2010multi,gupta2013stochastic,feldman2021online,jiang2022tight}. However, all of these works consider deterministic task costs known to the decision-maker at the beginning. 
Moreover, the lack of prior knowledge about task rewards and costs prevents the construction of a linear relaxation for solving GMP via OCRS and necessitates fundamentally different approaches.

Other online optimization problems differ structurally from GMP and GMPMW, such as the One-Way Trading Problem (OTP)~\citep{el2001optimal,lin2019competitive,cao2020optimal}, the Online Bipartite Matching Problem~\citep{mehta2013online,dickerson2021allocation,ijcai2023p607}, and the online Pandora's Box problem~\citep{esfandiari2019online,boodaghians2020pandora,gatmiry2024bandit}. We discuss these differences in more detail in Appendix~\ref{sec:A.B.1}.


