On the Finite-Sample Bias of Minimizing Expected Wasserstein Loss Between Empirical Distributions

Published: 22 Sept 2025, Last Modified: 01 Dec 2025 · NeurIPS 2025 Workshop · CC BY 4.0
Keywords: Wasserstein distance, Empirical distributions, Finite-sample bias, Parameter estimation
Abstract: We show that minimizing the expected Wasserstein loss between empirical distributions can lead to biased parameter estimates in finite-sample regimes. Specifically, when two empirical distributions are sampled from the same parametric family—one at a fixed parameter value and the other at a variable one—we find that minimizing the expected loss with respect to the variable parameter generally fails to recover the fixed value. We analytically verify this bias in simple one-dimensional settings, including location-scale models, by deriving closed-form expressions for the expected empirical Wasserstein loss. The analysis reveals that when the expected loss varies along the diagonal (where the two parameter values coincide), its gradient at the fixed parameter value is nonzero, shifting the minimizer away from it. To address this, we propose a simple correction scheme that eliminates the bias in well-specified cases. Numerical experiments confirm that stochastic gradient descent on the empirical Wasserstein loss converges to biased solutions and demonstrate the effectiveness of the proposed bias correction scheme.
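The effect described in the abstract can be reproduced with a minimal numerical sketch (not the authors' code): below, the expected one-dimensional Wasserstein-1 loss between equal-size empirical samples is approximated by Monte Carlo for a Gaussian scale family N(0, sigma^2), a well-specified setting where the expected loss varies along the diagonal. The sample size n, parameter grid, and Monte Carlo budget are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): finite-sample bias when
# minimizing the expected 1-D Wasserstein-1 loss between empirical distributions,
# illustrated for a Gaussian scale family N(0, sigma^2). The sample size n,
# the parameter grid, and the Monte Carlo budget are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n = 5          # small per-sample size, where the bias is pronounced
sigma0 = 1.0   # fixed ("true") scale parameter
n_mc = 20000   # Monte Carlo repetitions used to approximate the expected loss

# Common random numbers across the parameter grid:
# x ~ N(0, sigma0^2) at the fixed parameter, y = sigma * z with z ~ N(0, 1).
x = np.sort(sigma0 * rng.standard_normal((n_mc, n)), axis=1)
z = np.sort(rng.standard_normal((n_mc, n)), axis=1)

sigmas = np.linspace(0.3, 1.5, 121)
expected_loss = np.empty_like(sigmas)
for i, sigma in enumerate(sigmas):
    # W1 between equal-size 1-D empirical distributions equals the mean absolute
    # difference of the sorted samples (order statistics are optimally coupled).
    losses = np.mean(np.abs(x - sigma * z), axis=1)
    expected_loss[i] = losses.mean()

sigma_hat = sigmas[np.argmin(expected_loss)]
print(f"true sigma = {sigma0:.2f}, "
      f"minimizer of the expected empirical W1 loss ≈ {sigma_hat:.2f}")
# Along the diagonal sigma = sigma0 the expected loss scales linearly with the
# common scale, so its gradient at sigma0 is nonzero and the minimizer is pulled
# below sigma0, consistent with the bias described in the abstract.
```

Under these illustrative choices the printed minimizer lands below sigma0; fixing the random draws across the grid (common random numbers) only smooths the Monte Carlo curve and does not affect the location of the bias.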
Submission Number: 113