Keywords: Riemannian metric, geodesics, energy-based model, data-driven metric, energy landscape, shortest-path
TL;DR: We derive Riemannian metrics from pretrained EBMs to compute data-aware geodesics. Our approach outperforms standard methods across datasets, offering a scalable solution for learning data geometry in high-dimensional spaces.
Abstract: What is the shortest path between two data points in a high-dimensional space? In Euclidean geometry the answer is trivial: a straight line. The question becomes significantly harder when the data lies on a curved manifold, where a Riemannian metric is needed to describe the local geometry of the space. Estimating such a metric, however, remains a major challenge in high dimensions.
In this work, we propose a method for deriving Riemannian metrics directly from pretrained Energy-Based Models (EBMs)—a class of generative models that assign low energy to high-density regions.
These metrics define spatially varying distances, enabling the computation of geodesics: shortest paths that follow the intrinsic geometry of the data manifold. We introduce two novel EBM-derived metrics and show that the resulting geodesics remain closer to the data manifold and exhibit lower curvature distortion, as measured by alignment with ground-truth trajectories.
We evaluate our approach on increasingly complex datasets: synthetic datasets with known data density, rotated character images with interpretable geometry, and high-resolution natural images embedded in a pretrained VAE latent space.
Our results show that EBM-derived metrics consistently outperform established baselines, especially in high-dimensional settings.
Our work is the first to derive Riemannian metrics from EBMs, enabling data-aware geodesics and unlocking scalable, geometry-driven learning for generative modeling and simulation.
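To make the idea of an energy-dependent, spatially varying distance concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical toy energy that is low on the unit circle (a stand-in for a pretrained EBM) and an assumed conformal metric G(x) = exp(alpha * E(x)) * I, which is one simple way to make high-energy regions expensive to cross. The two metrics proposed in the paper are not specified in this abstract and may differ. The names `energy`, `curve_length`, `alpha`, and `sigma` are illustrative only.

```python
import numpy as np

def energy(x, sigma=0.3):
    """Hypothetical toy energy: zero on the unit circle, growing away from it."""
    r = np.linalg.norm(x, axis=-1)
    return (r - 1.0) ** 2 / (2.0 * sigma ** 2)

def curve_length(points, alpha=1.0):
    """Discretized length of a polyline under the assumed conformal metric
    G(x) = exp(alpha * E(x)) * I, evaluated at segment midpoints."""
    seg = np.diff(points, axis=0)                   # segment vectors
    mid = 0.5 * (points[1:] + points[:-1])          # segment midpoints
    scale = np.exp(0.5 * alpha * energy(mid))       # sqrt of the conformal factor
    return float(np.sum(scale * np.linalg.norm(seg, axis=1)))

# Two candidate paths from (-1, 0) to (1, 0).
t = np.linspace(0.0, 1.0, 200)
straight = np.stack([2.0 * t - 1.0, np.zeros_like(t)], axis=1)  # crosses the high-energy centre
theta = np.pi * (1.0 - t)
arc = np.stack([np.cos(theta), np.sin(theta)], axis=1)          # stays on the low-energy circle

straight_len = curve_length(straight)
arc_len = curve_length(arc)
print(f"straight: {straight_len:.3f}, arc: {arc_len:.3f}")
```

Under this assumed metric, the arc that stays in the low-energy region is shorter than the Euclidean straight line, even though it is longer in Euclidean terms. A geodesic solver would minimize `curve_length` (or the corresponding curve energy) over the intermediate points rather than compare two fixed candidates.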
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 12484