Choosing the parameter of the Fermat distance: navigating geometry and noise

Published: 26 Jun 2024, Last Modified: 26 Jun 2024. Accepted by TMLR.
Abstract: The Fermat distance has recently been established as a valuable tool for machine learning tasks, either when a natural distance is not directly available to the practitioner or when the results given by Euclidean distances can be improved by exploiting the geometrical and statistical properties of the dataset. This distance depends on a parameter $\alpha$ that significantly affects the performance of subsequent tasks. Ideally, the value of $\alpha$ should be large enough to navigate the geometric intricacies inherent to the problem, yet small enough to avoid the deleterious effects of noise in the distance estimation process. We study, both theoretically and through simulations, how to select this parameter.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Minor changes were made to address the reviewers' comments. References and clarifications were added to the introduction, and a remark was added after Proposition 3.2.
Assigned Action Editor: ~Florent_Krzakala1
Submission Number: 1885