Partial Optimal Transport for Support Subset Selection

Published: 06 Dec 2023, Last Modified: 06 Dec 2023. Accepted by TMLR.
Abstract: In probabilistic terms, optimal transport aims to find a joint distribution that couples two distributions and minimizes the cost of transforming one distribution into the other. Any feasible coupling necessarily maintains the support of both distributions. However, maintaining the entire support is not ideal when only a subset of one of the distributions, namely the source, is assumed to align with the other, the target distribution. For these cases, which are common in machine learning applications, we study the semi-relaxed partial optimal transport problem, which relaxes the constraints on the joint distribution, allowing it to under-represent a subset of the source by over-representing other subsets of the source by a constant factor. In the discrete distribution case, such as in the case of two samples from continuous random variables, optimal transport with the relaxed constraints is a linear program. When sufficiently relaxed, the solution has a source marginal with only a subset of its original support. We investigate the scaling path of solutions, specifically the relaxed marginal distribution for the source, across different relaxations and show that it is distinct from the solutions of penalty-based semi-relaxed unbalanced optimal transport problems and of fully-relaxed partial optimal transport, both of which have previously been explored. We demonstrate the usefulness of this support subset selection in applications such as color transfer, partial point cloud alignment, and semi-supervised machine learning, where part of the data is curated to have reliable labels and another part is unlabeled or has unreliable labels. Our experiments show that optimal transport under the relaxed constraint can improve the performance of these applications by allowing for more flexible alignment between distributions.
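To make the linear-program view in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes one plausible reading of the relaxed constraints: the target marginal is matched exactly, while each source point may carry at most `c` times its original mass. The function name `semi_relaxed_partial_ot`, the scaling factor `c`, and the toy data are all hypothetical and chosen for illustration only.

```python
import numpy as np
from scipy.optimize import linprog


def semi_relaxed_partial_ot(cost, mu, nu, c):
    """Solve min <cost, P> s.t. P 1 <= c * mu, P^T 1 = nu, P >= 0.

    Assumed formulation (not taken verbatim from the paper): the source
    marginal is only bounded above by c * mu, the target marginal is exact.
    """
    n, m = cost.shape
    # P is flattened row-major, so entry (i, j) sits at index i * m + j.
    row_sum = np.kron(np.eye(n), np.ones((1, m)))  # (n, n*m): source marginal
    col_sum = np.kron(np.ones((1, n)), np.eye(m))  # (m, n*m): target marginal
    res = linprog(
        cost.ravel(),
        A_ub=row_sum, b_ub=c * mu,   # relaxed source constraint
        A_eq=col_sum, b_eq=nu,       # exact target constraint
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, m)


# Toy data: four source points on a line, two target points.
source = np.array([0.0, 1.0, 2.0, 10.0])
target = np.array([0.0, 1.0])
cost = (source[:, None] - target[None, :]) ** 2  # squared distance
mu = np.full(4, 0.25)  # uniform source weights
nu = np.full(2, 0.5)   # uniform target weights

plan_balanced = semi_relaxed_partial_ot(cost, mu, nu, c=1.0)
plan_relaxed = semi_relaxed_partial_ot(cost, mu, nu, c=2.0)
print(plan_balanced.sum(axis=1))  # source marginal with no relaxation
print(plan_relaxed.sum(axis=1))   # source marginal with c = 2
```

With `c = 1` the upper bounds sum to the total mass, so the source marginal is forced to equal `mu`. With `c = 2` the two source points that coincide with the targets can absorb all the mass, and the remaining source points drop out of the support, which is the subset-selection behavior the abstract describes.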
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: 1. New experimental results for PU learning using fully-relaxed partial Wasserstein have been added to the introduction to clarify our statement regarding Chapel et al. (2020). 2. Because Chapel et al. (2020) introduce constraints in addition to the regular optimal transport constraints, we created a separate subsubsection, $\textbf{Strictly uniform, semi-relaxed partial optimal transport}$, in the subsection $\textbf{Relation to Prior Work}$ of the methodology section. 3. The figures and their captions have been changed as requested. Reference: Laetitia Chapel, Mokhtar Z Alaya, and Gilles Gasso. Partial Optimal Tranport with Applications on Positive-Unlabeled Learning. Advances in Neural Information Processing Systems, 33:2903–2913, 2020.
Supplementary Material: zip
Assigned Action Editor: ~Rémi_Flamary1
Submission Number: 1378