Abstract: We propose the first data augmentation method based on optimal transport theory, with the generated data guaranteed to belong to the original data manifold. The proposed algorithm randomly samples a clique in the nearest-neighbors graph representing the data and computes the Wasserstein barycenter of the clique members with random uniform weights. The resulting barycenters look highly natural, and many of them are produced iteratively to oversample the original dataset. We apply this approach to landmark detection in unsupervised and semi-supervised scenarios, on the popular tasks of face keypoint extraction, pose detection, and segmentation of anatomical contours in medical imaging. The barycentric oversampling approach is shown to outperform state-of-the-art data augmentation methods. The code is available at https://github.com/cviaai/LAMBO/.
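The sampling loop described in the abstract can be illustrated with a minimal sketch. It is not the code from the LAMBO repository, and it makes several simplifying assumptions: each data item is a 1D empirical measure with the same number of equally weighted atoms (so the 2-Wasserstein barycenter reduces to a weighted average of sorted atoms), "random uniform weights" is read as a draw from the flat Dirichlet distribution on the simplex, and the clique is grown greedily. All function names (`knn_graph`, `sample_clique`, `barycenter_1d`, `oversample`) are hypothetical.

```python
import numpy as np

def w2_distance_1d(a, b):
    # W2 distance between two 1D empirical measures with equally many,
    # equally weighted atoms: match sorted atoms pairwise.
    return np.sqrt(np.mean((np.sort(a) - np.sort(b)) ** 2))

def knn_graph(data, k):
    # Symmetric boolean adjacency: i and j are linked if either one is
    # among the k nearest neighbours of the other (requires k < len(data)).
    n = len(data)
    dist = np.array([[w2_distance_1d(data[i], data[j]) for j in range(n)]
                     for i in range(n)])
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return adj | adj.T

def sample_clique(adj, rng, max_size=3):
    # Assumption: greedy random clique. Start from a random node and keep
    # adding random nodes that are adjacent to every current member.
    clique = [int(rng.integers(len(adj)))]
    candidates = np.flatnonzero(adj[clique[0]])
    while len(clique) < max_size and len(candidates) > 0:
        nxt = int(rng.choice(candidates))
        clique.append(nxt)
        candidates = candidates[adj[nxt, candidates]]
    return clique

def barycenter_1d(samples, weights):
    # In 1D the W2 barycenter averages quantile functions, i.e. it is the
    # weighted average of the sorted atoms of each measure.
    return weights @ np.sort(samples, axis=1)

def oversample(data, n_new, k=3, seed=0):
    # Generate n_new synthetic items, one barycenter per sampled clique.
    rng = np.random.default_rng(seed)
    adj = knn_graph(data, k)
    new_items = []
    for _ in range(n_new):
        clique = sample_clique(adj, rng)
        # Assumption: weights drawn uniformly from the probability simplex.
        weights = rng.dirichlet(np.ones(len(clique)))
        new_items.append(barycenter_1d(data[clique], weights))
    return np.array(new_items)

# Toy data: 20 measures, each with 50 atoms drawn around a random centre.
rng = np.random.default_rng(1)
centres = rng.uniform(-2, 2, size=(20, 1))
data = rng.normal(loc=centres, scale=1.0, size=(20, 50))
augmented = oversample(data, n_new=5)
print(augmented.shape)  # (5, 50)
```

For 2D landmark sets the barycenter has no such closed form and would need an iterative solver; the 1D case is used here only because it keeps the example exact and dependency-free.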