On-Demand Sampling: Learning Optimally from Multiple Distributions

Nika Haghtalab; Michael Jordan; Eric Zhao

On-Demand Sampling: Learning Optimally from Multiple Distributions

Nika Haghtalab, Michael Jordan, Eric Zhao

Published: 31 Oct 2022, Last Modified: 04 Aug 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: Sample Complexity, Distributionally Robust Optimization, Collaborative Learning, Learning Theory, Minmax Equilibria, Online Mirror Descent

Abstract: Societal and real-world considerations such as robustness, fairness, social welfare and multi-agent tradeoffs have given rise to multi-distribution learning paradigms, such as collaborative [Blum et al. 2017], group distributionally robust [Sagawa et al. 2019], and fair federated learning [Mohri et al. 2019]. In each of these settings, a learner seeks to minimize its worstcase loss over a set of $n$ predefined distributions, while using as few samples as possible. In this paper, we establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity. Importantly, our sample complexity bounds exceed that of the sample complexity of learning a single distribution only by an additive factor of $\frac{n\log(n)}{\epsilon^2}$. These improve upon the best known sample complexity of agnostic federated learning by Mohri et al. 2019 by a multiplicative factor of $n$, the sample complexity of collaborative learning by Nguyen and Zakynthinou 2018 by a multiplicative factor $\frac{\log(n)}{\epsilon^3}$, and give the first sample complexity bounds for the group DRO objective of Sagawa et al. 2019. To achieve optimal sample complexity, our algorithms learn to sample and learn from distributions on demand. Our algorithm design and analysis extends stochastic optimization techniques to solve zero-sum games in a new stochastic setting.

TL;DR: We give optimal sample complexity bounds for several multi-distribution learning problems using insights from finding min-max equilibria in stochastic zero-sum games.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/on-demand-sampling-learning-optimally-from/code)

13 Replies

Loading