Towards Understanding Gradient Dynamics of the Sliced-Wasserstein Distance via Critical Point Analysis
Abstract: In this paper, we investigate the properties of the Sliced-Wasserstein distance (SW) when employed as an objective functional. The SW metric has attracted significant interest in the optimal transport and machine learning literature because it captures intricate geometric properties of probability distributions while remaining computationally tractable. This makes it a valuable tool for applications such as generative modeling and domain adaptation.
Our study aims to provide a rigorous analysis of the critical points arising from the optimization of the SW objective. By computing explicit perturbations, we establish that stable critical points of SW cannot concentrate on segments. This stability analysis is crucial for understanding the behavior of optimization algorithms for models trained using the SW objective. Furthermore, we investigate the properties of the SW objective, shedding light on the existence and convergence behavior of critical points. We illustrate our theoretical results through numerical experiments.
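To make the notion of SW as a computationally tractable objective concrete, the following is a minimal sketch of a Monte Carlo estimate of the squared Sliced-Wasserstein distance between two point clouds. It is not taken from the linked repository: the function name `sliced_wasserstein_sq`, the parameters `n_projections` and `seed`, and the restriction to equal-size, uniformly weighted point clouds are all assumptions made for illustration.

```python
import numpy as np


def sliced_wasserstein_sq(x, y, n_projections=200, seed=0):
    """Monte Carlo estimate of SW_2^2 between two point clouds.

    Illustrative sketch only (hypothetical helper, not the authors' code).
    Assumes x and y have the same shape (n, d) and uniform weights.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Draw directions uniformly on the unit sphere by normalizing Gaussians.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project both clouds onto every direction: shape (n, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)

    # In 1D, optimal transport between equal-size uniform samples matches
    # sorted points, so W_2^2 is the mean squared gap between sorted values.
    # Averaging over directions approximates the integral over the sphere.
    return float(np.mean((x_proj - y_proj) ** 2))
```

Under these assumptions, each projection costs only a sort, which is where the tractability of SW comes from. Used as an objective, one would differentiate this quantity with respect to the positions in `x`; the critical points studied in the paper are the configurations where that gradient vanishes.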
Lay Summary: Generative modeling consists of training a computer to create new content based on a set of existing data (for example, new pictures of birds based on a dataset of existing pictures). To do so, it is generally useful to be able to quantify how far away the content generated by the program is from the actual data. In other words, we need to be able to calculate a distance between datasets, and it turns out that there is a mathematical theory which is well suited to this task: the theory of optimal transport.
In our work, we focus on one of the tools of optimal transport, the so-called "Sliced-Wasserstein distance", which, thanks to its many advantageous properties (notably its ease of calculation), has recently garnered much interest in the machine learning community. We uncover theoretical and empirical evidence that machine learning algorithms that make use of this distance will work seamlessly, and avoid getting "stuck" at some state where the computer becomes unable to improve its performance.
Link To Code: https://github.com/cvauthier/Critical-Points-of-Sliced-Wasserstein
Primary Area: Optimization
Keywords: optimal transport, optimisation, sliced-wasserstein distance
Submission Number: 6955