Keywords: optimal transport, manifold learning, simulation-based inference, flow matching
TL;DR: We propose a highly effective method for identifying local manifold structure by fitting a local linear model mapped through a pretrained OT flow.
Abstract: The manifold hypothesis states that high-dimensional data often concentrate near low-dimensional structures. Identifying these local manifold structures allows us to separate signals from ambient noise.
In this paper, we study the problem of identifying local manifold structure from data.
Given a query point $y_0$ from a dataset, our goal is to recover the geometry of the data manifold in a neighborhood of $y_0$
using a \emph{pretrained} optimal transport flow from the reference to the data distribution.
First, we prove that the Brenier optimal transport map preserves
manifold structure: the preimage of an $m$-dimensional data manifold
is itself an $m$-dimensional manifold in the reference space.
Second, motivated by this result, we propose a latent variable model that maps a linear model through the transport flow.
We prove that the linear approximation error is significantly reduced by the optimal transport map, leading to a tight fit of the non-linear data manifold.
Third, noting the intractability of the resulting likelihood, we deploy
denoising Fisher score estimation --- a recent development from
simulation-based inference that learns the Fisher score over
parameter--observation pairs --- to perform likelihood-based inference effectively.
Experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed method.
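
To illustrate the idea of mapping a linear model through a transport flow, the following is a minimal sketch under stated assumptions, not the paper's implementation. The function `toy_flow`, its parameters `w` and `a`, and the helper `push_local_linear` are hypothetical names introduced only for this example. The toy map is the gradient of the convex potential $\phi(z) = \tfrac{1}{2}\|z\|^2 + a \log\cosh(w^\top z)$ and merely stands in for a pretrained flow. The sketch shows that a straight line in the reference space becomes a curved one-dimensional set in the data space.

```python
import numpy as np

def toy_flow(z, w, a=1.0):
    """Hypothetical stand-in for a pretrained flow (NOT the paper's map).

    Gradient of the convex potential 0.5*||z||^2 + a*log(cosh(w.z)),
    i.e. z + a*tanh(w.z)*w, applied row-wise.
    """
    z = np.atleast_2d(z)
    return z + a * np.tanh(z @ w)[:, None] * w

def push_local_linear(x0, basis, coeffs, w):
    """Map points x0 + basis @ u of a local linear model through toy_flow."""
    z = x0 + coeffs @ basis.T  # points on an affine subspace in reference space
    return toy_flow(z, w)

# Assumed toy setup: 2-D ambient space, 1-D linear model through x0.
x0 = np.array([0.5, -0.2])                        # reference-space anchor point
basis = np.array([[1.0], [1.0]]) / np.sqrt(2.0)   # one unit direction, shape (2, 1)
w = np.array([1.0, 0.0])                          # direction used by the toy potential
coeffs = np.linspace(-2.0, 2.0, 41)[:, None]      # latent coordinates u, shape (41, 1)

curve = push_local_linear(x0, basis, coeffs, w)   # curved 1-D set in data space
print(curve.shape)
```

In this toy setting the latent coordinates stay one-dimensional while the image in data space is nonlinear, which is the qualitative behavior the abstract describes. Fitting `x0` and `basis` to data, and the likelihood-based inference step, are outside the scope of this sketch.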
Submission Number: 102