Magnitude Distance: A Geometric Measure of Dataset Similarity

Published: 31 Oct 2025, Last Modified: 28 Nov 2025
Venue: EurIPS 2025 Workshop PriGM
License: CC BY 4.0
Keywords: Dataset distance, Distance metrics, Magnitude of metrics
TL;DR: We introduce magnitude distance, a measure of dissimilarity between finite sets that accounts for the geometry and structure of the data, and demonstrate theoretical properties including aspects of the metric axioms and outlier robustness.
Abstract: Quantifying the distance between datasets is a fundamental problem in mathematics and machine learning. We propose magnitude distance, a novel distance on datasets based on the notion of the magnitude of a metric space. It is an intuitive and outlier-robust geometric distance between two finite sets in $\mathbb{R}^D$. We prove various properties of magnitude distance, including aspects of the metric axioms and how it can be tuned to pay more attention to local versus global structure. We also give an experimental example demonstrating its outlier robustness.
Submission Number: 20
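
The abstract builds on the magnitude of a metric space but does not spell the notion out. As a rough illustration only, the sketch below computes the standard magnitude of a finite point set in $\mathbb{R}^D$: form the similarity matrix $Z_{ij} = e^{-t\,d(x_i, x_j)}$, solve $Zw = \mathbf{1}$ for the weighting $w$, and sum its entries. The function name `magnitude`, the scale parameter `t`, and the choice of Euclidean distance are assumptions made for this example. This is not the paper's magnitude distance, whose definition is not given on this page.

```python
import numpy as np


def magnitude(points, t=1.0):
    """Magnitude of a finite point set under the scaled Euclidean metric t * d.

    Illustrative helper (hypothetical name and signature). Assumes the points
    are distinct, so that the similarity matrix is invertible.
    """
    X = np.asarray(points, dtype=float)
    if X.ndim == 1:
        # Treat a flat list of numbers as points on the real line.
        X = X[:, None]

    # Pairwise Euclidean distances d(x_i, x_j).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))

    # Similarity matrix Z_ij = exp(-t * d(x_i, x_j)).
    Z = np.exp(-t * dists)

    # The weighting w solves Z w = 1; the magnitude is the sum of its entries.
    w = np.linalg.solve(Z, np.ones(X.shape[0]))
    return float(w.sum())
```

For two points at distance $d$ this reduces to the closed form $2 / (1 + e^{-td})$. As `t` grows the value approaches the number of points, and as `t` shrinks it approaches 1, which is one way to see how a scale parameter can shift emphasis between local and global structure.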