Small Transformers Compute Universal Metric Embeddings

Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

27 Mar 2023 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about $n\log(n)$ and width about $n^2$ can bi-H\"{o}lder embed any $n$-point dataset from $\mathcal{X}$ with low metric distortion, thus avoiding the curse of dimensionality. We further derive probabilistic bi-Lipschitz guarantees, which trade off the amount of distortion against the probability that a randomly chosen pair of points embeds with that distortion. If $\mathcal{X}$'s geometry is sufficiently regular, we obtain stronger, bi-Lipschitz guarantees for all points in the dataset. As applications, we derive neural embedding guarantees for datasets from Riemannian manifolds, metric trees, and certain types of combinatorial graphs.
