Online Social Welfare Function-based Resource Allocation

Kanad Shrikar Pardeshi; Samsara Foubert; Aarti Singh

Published: 30 Apr 2026, Last Modified: 24 Jun 2026. Venue: ICML 2026 (regular). License: CC BY 4.0.

TL;DR: This work studies online learning and anytime-valid inference for general social welfare objectives under partial feedback.

Abstract: In many real-world settings, a centralized decision-maker must repeatedly allocate finite resources to a population over multiple time steps. Individuals who receive a resource derive some stochastic utility; to characterize the population-level effects of an allocation, the expected individual utilities are then aggregated using a social welfare function (SWF). We formalize this setting and present a general confidence sequence framework for SWF-based online learning and inference, valid for any monotonic, concave, and Lipschitz-continuous SWF. Our key insight is that monotonicity alone suffices to lift confidence sequences from individual utilities to anytime-valid bounds on optimal welfare. Building on this foundation, we propose SWF-UCB, an SWF-agnostic online learning algorithm that achieves near-optimal $\tilde{\mathcal{O}}(n+\sqrt{nkT})$ regret (for $k$ resources distributed among $n$ individuals at each of $T$ time steps). We instantiate our framework on three normatively distinct SWF families: Weighted Power Mean, Kolm, and Gini, providing bespoke oracle algorithms for each. Experiments confirm $\sqrt{T}$ scaling and reveal rich interactions between $k$ and SWF parameters. This framework naturally supports inference applications such as sequential hypothesis testing, optimal stopping, and policy evaluation.
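
The monotonicity argument in the abstract can be illustrated with a minimal Python sketch. It is not the paper's SWF-UCB algorithm or its oracle routines. The three functions below use common textbook forms of the weighted power mean, Kolm, and generalized Gini welfare functions, which are assumptions here because the abstract does not state the exact parameterizations. The helper `lift_interval` and all argument names are hypothetical. The sketch shows only the basic step: if every individual's expected utility lies between a lower and an upper confidence bound, a monotone SWF evaluated at those bounds brackets the true welfare.

```python
import math

def weighted_power_mean(u, w, p):
    """Weighted power mean of positive utilities u; weights w sum to 1 (assumed form)."""
    if p == 0:
        # The p -> 0 limit is the weighted geometric mean.
        return math.exp(sum(wi * math.log(ui) for wi, ui in zip(w, u)))
    return sum(wi * ui ** p for wi, ui in zip(w, u)) ** (1.0 / p)

def kolm(u, alpha):
    """Kolm welfare with inequality aversion alpha > 0 (assumed form)."""
    n = len(u)
    return -math.log(sum(math.exp(-alpha * ui) for ui in u) / n) / alpha

def generalized_gini(u, w):
    """Generalized Gini welfare: weights w are applied to utilities sorted ascending.

    With non-increasing, non-negative w, the worst-off individuals get the most weight.
    """
    return sum(wi * ui for wi, ui in zip(w, sorted(u)))

def lift_interval(swf, lower, upper):
    """Hypothetical helper: lift coordinatewise utility bounds to welfare bounds.

    Valid for any SWF that is non-decreasing in every coordinate:
    lower <= mu <= upper coordinatewise implies swf(lower) <= swf(mu) <= swf(upper).
    """
    return swf(lower), swf(upper)

# Illustrative numbers only: three individuals with made-up confidence bounds.
lower = [0.1, 0.3, 0.5]
upper = [0.3, 0.5, 0.7]
equal_weights = [1 / 3, 1 / 3, 1 / 3]
gini_weights = [0.5, 0.3, 0.2]

bounds = {
    "power_mean_p=-1": lift_interval(lambda u: weighted_power_mean(u, equal_weights, -1), lower, upper),
    "kolm_alpha=2": lift_interval(lambda u: kolm(u, 2.0), lower, upper),
    "gini": lift_interval(lambda u: generalized_gini(u, gini_weights), lower, upper),
}
```

If the utility bounds come from a confidence sequence that holds simultaneously over all time steps, the lifted welfare bounds inherit that coverage. Bounding the optimal welfare additionally requires maximizing over allocations, which this sketch does not attempt.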

Lay Summary: In many real-world settings, a centralized decision-maker must repeatedly allocate finite resources to a population over multiple time steps. Individuals in the population receive some random utility upon receiving the resource. A natural goal for the decision-maker is to learn/infer a randomized allocation to maximize an aggregate of the expected individual utilities, with these aggregations specified by social welfare functions. In this work, we develop a statistical framework to enable online learning and inference of these optimal allocations. Along with providing the general framework, we consider three popular families of social welfare functions and provide exact methods for them. We also conduct synthetic experiments on the online learning setup to study variations in outcomes with changing problem parameters.

Primary Area: Theory->Online Learning and Bandits

Keywords: Resource allocation, social welfare functions, fairness, online learning and inference

Submission Number: 29237
