Abstract: We study *online multicalibration*, a framework for ensuring calibrated predictions across multiple groups in adversarial settings over $T$ rounds. Although online calibration is typically studied in the $\ell_1$ norm, prior approaches to online multicalibration have taken the indirect route of obtaining rates in other norms (such as $\ell_2$ and $\ell_{\infty}$) and then transferring these guarantees to $\ell_1$ at an additional loss. In contrast, we propose a direct method for online $\ell_1$-multicalibration that achieves an improved rate of $\widetilde{\mathcal{O}}(T^{-1/3})$ and an oracle-efficient rate of $\widetilde{\mathcal{O}}(T^{-1/4})$. Our key insight is a novel reduction of online $\ell_1$-multicalibration to an online learning problem with product-based rewards, which we refer to as *online linear-product optimization* ($\mathtt{OLPO}$).
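For concreteness, one commonly used form of the $\ell_1$-multicalibration error over a finite grid of prediction buckets is sketched below; the bucketing $\{B_v\}_{v \in \mathcal{V}}$, the residual $(y_t - p_t)$, and the normalization are illustrative assumptions and may differ from the paper's exact definition:
$$
\mathrm{MCalErr}_{\ell_1}(\mathcal{H}) \;=\; \max_{h \in \mathcal{H}} \; \frac{1}{T}\sum_{v \in \mathcal{V}} \Bigl| \sum_{t \,:\, p_t \in B_v} h(x_t)\,\bigl(y_t - p_t\bigr) \Bigr|,
$$
where $p_t \in [0,1]$ is the prediction at round $t$, $y_t \in \{0,1\}$ the realized outcome, and $h(x_t) \in [0,1]$ the (possibly fractional) membership of context $x_t$ in group $h$.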
To obtain the improved rate of $\widetilde{\mathcal{O}}(T^{-1/3})$, we introduce a linearization of $\mathtt{OLPO}$ and design a no-regret algorithm for this linearized problem. Although this method guarantees the desired sublinear rate (nearly matching the best rate for online calibration), it is computationally expensive when the group family $\mathcal{H}$ is large or infinite, since it enumerates all possible groups. To address scalability, we propose a second approach to $\mathtt{OLPO}$ that makes only a polynomial number of calls to an offline optimization (*multicalibration evaluation*) oracle, resulting in *oracle-efficient* online $\ell_1$-multicalibration with a corresponding rate of $\widetilde{\mathcal{O}}(T^{-1/4})$. Our framework also extends to certain infinite families of groups (e.g., all linear functions on the context space) by exploiting a $1$-Lipschitz property of the $\ell_1$-multicalibration error with respect to $\mathcal{H}$.
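As a rough illustration of what a *multicalibration evaluation* oracle might compute for a finite group family, the following sketch returns the worst-off group and its empirical $\ell_1$-multicalibration error from past data; the function name, interface, and bucketing scheme are hypothetical and are not taken from the paper.

```python
import numpy as np

def l1_multicalibration_error(preds, outcomes, group_memberships, n_buckets=10):
    """Return the worst-off group index and its empirical l1-multicalibration
    error, computed from past predictions, outcomes, and group memberships.

    preds:             shape (T,), predictions in [0, 1]
    outcomes:          shape (T,), binary outcomes in {0, 1}
    group_memberships: shape (T, n_groups), entries h(x_t) in [0, 1]
    """
    T = len(preds)
    # Discretize predictions into buckets (an illustrative choice; the paper
    # may use a different discretization).
    buckets = np.minimum((preds * n_buckets).astype(int), n_buckets - 1)
    residuals = outcomes - preds  # y_t - p_t

    errors = np.zeros(group_memberships.shape[1])
    for v in range(n_buckets):
        in_bucket = buckets == v
        # Group-weighted residual mass accumulated inside this bucket.
        bucket_sums = group_memberships[in_bucket].T @ residuals[in_bucket]
        errors += np.abs(bucket_sums) / T
    worst = int(np.argmax(errors))
    return worst, float(errors[worst])


# Example usage with random data (purely illustrative).
rng = np.random.default_rng(0)
T, n_groups = 1000, 5
preds = rng.uniform(0, 1, size=T)
outcomes = rng.binomial(1, preds)
groups = rng.integers(0, 2, size=(T, n_groups)).astype(float)
print(l1_multicalibration_error(preds, outcomes, groups))
```

An oracle-efficient learner in the sense of the abstract would invoke such an evaluation routine only a polynomial number of times, rather than enumerating every group in $\mathcal{H}$ explicitly.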
Lay Summary: In many real-world applications, such as loan approval, medical diagnosis, or hiring, machine learning algorithms make predictions that impact different groups of people. A key criterion for evaluating such probability forecasters is *calibration*: among the rounds on which the forecaster predicts probability $p \in [0,1]$, the empirical frequency of the event should converge to $p$. However, a major limitation of standard calibration is that it may still yield systematically biased predictions for specific subpopulations defined by features like gender, race, or age.
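In symbols, using a hypothetical finite grid $\mathcal{V}$ of predicted values, this requirement asks that the following error vanish as $T$ grows (the grid and normalization are illustrative assumptions, not the paper's exact definition):
$$
\mathrm{CalErr}_{\ell_1} \;=\; \frac{1}{T}\sum_{v \in \mathcal{V}} \Bigl| \sum_{t \,:\, p_t = v} \bigl(y_t - v\bigr) \Bigr|,
$$
where $p_t$ is the predicted probability at round $t$ and $y_t \in \{0,1\}$ is the realized outcome.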
Our work addresses this limitation through a stronger fairness notion called *multicalibration*, which requires that predictions be calibrated not only over the population as a whole, but simultaneously on every relevant subgroup. We study this problem in an online setting, where predictions must be made sequentially and without access to future data, a scenario that captures the constraints of many real-time decision-making systems.
We develop new algorithms that are both computationally efficient and theoretically sound, ensuring fair predictions even when feedback is limited and data arrives in a stream. Our approach reframes the problem in a novel way that makes it easier to solve and analyze with tools from online optimization. As a result, we obtain improved guarantees on how quickly fairness can be achieved, while keeping our algorithms efficient enough for practical deployment.
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: online multicalibration, online learning, oracle-efficient algorithms
Submission Number: 12045