On the Finite-Sample Bias of Minimizing Expected Wasserstein Loss Between Empirical Distributions

Published: 03 Feb 2026, Last Modified: 03 Feb 2026 · AISTATS 2026 Poster · CC BY 4.0
Abstract: We show that minimizing the expected Wasserstein loss between empirical distributions can lead to biased parameter estimates in the finite-sample regime. Remarkably, such bias arises even in well-specified settings where both empirical distributions are drawn from the same parametric family: unlike maximum likelihood estimation (understood here as maximizing the expected log-likelihood), fixing one parameter at its true value and optimizing the other fails to recover the true value of the free parameter. We derive closed-form expressions for the expected Wasserstein loss in one dimension and, focusing on location–scale models, provide an analytic characterization of the resulting bias. This analysis reveals that finite-sample bias occurs whenever the expected loss varies along the diagonal subspace where the two parameter values coincide, and we propose a simple correction scheme that removes this effect. We further extend the analysis to misspecified models and to the Sinkhorn divergence, demonstrating that finite-sample bias persists in these more practical settings. Experiments on synthetic and real data confirm that stochastic optimization of Wasserstein-based objectives converges to biased solutions, and validate the effectiveness of the proposed correction scheme.
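To make the bias concrete, the following minimal sketch (an illustration, not the authors' code) works through the well-specified one-dimensional case: data drawn from N(0, 1), a scale-only model N(0, σ) with the location fixed at its true value, and σ chosen to minimize a Monte Carlo estimate of the expected squared 2-Wasserstein loss between two size-n empirical samples. It relies only on the standard closed form for the 1D distance between equal-size empirical distributions (match sorted samples); the sample size, grid, and replication count are arbitrary illustrative choices.

```python
import numpy as np

# Minimal sketch (assumed setup, not the paper's code): estimate the
# scale of a Gaussian by minimizing a Monte Carlo average of the squared
# 2-Wasserstein distance between two small empirical samples.
# In 1D, W2^2 between equal-size empirical distributions has the closed
# form (1/n) * sum_i (x_(i) - y_(i))^2 over the sorted samples.

rng = np.random.default_rng(0)

n = 5            # small sample size, where the bias is pronounced
n_reps = 20_000  # Monte Carlo replications of the expected loss
sigmas = np.linspace(0.5, 1.5, 101)  # candidate scale parameters
losses = np.zeros_like(sigmas)

for _ in range(n_reps):
    xs = np.sort(rng.standard_normal(n))  # data sample from N(0, 1)
    zs = np.sort(rng.standard_normal(n))  # model noise; model sample is s * z
    # For s > 0, scaling commutes with sorting, so the sorted model
    # sample is s * zs and W2^2 can be evaluated for all s at once.
    diffs = xs[None, :] - sigmas[:, None] * zs[None, :]
    losses += np.mean(diffs**2, axis=1)

losses /= n_reps
print("minimizer of expected W2^2:", sigmas[np.argmin(losses)])
# Prints a scale well below the true value 1.0 (about 0.64 for n = 5),
# illustrating the finite-sample bias described in the abstract.
```

In this toy case the averaged objective expands to 1 + σ^2 - 2σ c_n with c_n = (1/n) Σ_i (E[X_(i)])^2, so the minimizer is c_n, which is strictly below 1 because the order statistics of a size-n sample have positive variance; the bias vanishes as n grows, consistent with the abstract's finite-sample characterization.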
Submission Number: 1932