Conditional Wasserstein Barycenters and Interpolation/Extrapolation of Distributions

Published: 01 Jan 2025, Last Modified: 14 May 2025. Venue: IEEE Trans. Inf. Theory 2025. License: CC BY-SA 4.0
Abstract: Increasingly complex data analysis tasks motivate the study of the dependency of distributions of multivariate continuous random variables on scalar or vector predictors. Statistical regression models for distributional responses are a key technique for the emerging field of distributional data analysis, but so far have primarily been investigated for the case of one-dimensional response distributions. We investigate here the challenging case of conditional Fréchet means for multivariate response distributions under the Wasserstein metric, which has not been studied previously but is relevant for statistical data analysis in various fields, including climatology and health, as we demonstrate with real data applications. A second innovation is that we harness the notion of conditional barycenters and geodesics in the Wasserstein space to interpolate as well as extrapolate multivariate distributions under suitable regularity conditions, where even the simpler case of extrapolating one-dimensional distributions has not been studied before. We cover both global parametric-like and local smoothing-like models to implement conditional Wasserstein barycenters and establish asymptotic convergence properties for the corresponding estimates under suitable regularity assumptions. The utility of distributional inter- and extrapolation is explored in both simulations and examples. Conditional Wasserstein barycenters and distribution extrapolation are specifically illustrated with applications in epidemiology and climatology.
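To make the idea of interpolating and extrapolating along a Wasserstein geodesic concrete, here is a minimal sketch for the special case of two multivariate Gaussian distributions, where the 2-Wasserstein geodesic has a well-known closed form. This is an illustration only and not the estimation procedure of the paper: the Gaussian restriction, the function names (`sqrtm_psd`, `gaussian_ot_map`, `gaussian_geodesic`), and the positive-definiteness check used as a stand-in for "suitable regularity conditions" are all assumptions made here. Values of `t` in [0, 1] interpolate between the two distributions; values of `t` greater than 1 extrapolate beyond the second one.

```python
import numpy as np


def sqrtm_psd(S):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def gaussian_ot_map(S0, S1):
    """Matrix A of the optimal transport map x -> m1 + A (x - m0)
    that pushes N(m0, S0) forward to N(m1, S1); S0 must be invertible."""
    R = sqrtm_psd(S0)
    R_inv = np.linalg.inv(R)
    return R_inv @ sqrtm_psd(R @ S1 @ R) @ R_inv


def gaussian_geodesic(m0, S0, m1, S1, t):
    """Mean and covariance at position t on the Wasserstein geodesic
    from N(m0, S0) (t = 0) to N(m1, S1) (t = 1).

    For t > 1 (extrapolation) the map (1 - t) I + t A must stay positive
    definite; otherwise the extended curve is no longer a geodesic and
    this hypothetical helper raises an error.
    """
    A = gaussian_ot_map(S0, S1)
    M = (1.0 - t) * np.eye(len(m0)) + t * A
    if np.linalg.eigvalsh((M + M.T) / 2.0).min() <= 0.0:
        raise ValueError("t is outside the range where the geodesic extends")
    mean = (1.0 - t) * m0 + t * m1
    cov = M @ S0 @ M.T
    return mean, cov


# Illustrative inputs (made up for this sketch, not data from the paper)
m0, S0 = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
m1, S1 = np.array([1.0, 2.0]), np.diag([4.0, 1.0])

mid_mean, mid_cov = gaussian_geodesic(m0, S0, m1, S1, 0.5)  # interpolation
ext_mean, ext_cov = gaussian_geodesic(m0, S0, m1, S1, 1.5)  # extrapolation
```

In this toy setting the second coordinate shrinks along the geodesic, so extrapolation is only possible up to the point where its variance would collapse to zero (here `t = 2`), which illustrates why extrapolation needs regularity conditions while interpolation does not.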