Abstract: Metric embeddings are fundamental in machine learning, enabling similarity search, dimensionality reduction, and representation learning. They underpin modern architectures like transformers and large language models, facilitating scalable training and improved generalization.
On the theory side, a classic problem in embedding design is to map an arbitrary metric into an $\ell_p$ space while approximately preserving pairwise distances. We study this problem in a fully dynamic setting, where the underlying metric is a graph metric subject to edge insertions and deletions.
Our goal is to efficiently maintain a low-distortion embedding after each update.
We present the first fully dynamic algorithm for this problem, achieving $O(\log n)^{2q} \cdot O(\log(nW))^{q-1}$ expected distortion with $O(m^{1/q + o(1)})$ update time and $O(q \log n \log(nW))$ query time, where $q \ge 2$ is an integer parameter.
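To make the static version of the problem concrete, below is a minimal Python sketch of the classic Bourgain-style embedding of a graph metric into $\ell_1$. This is the well-known static baseline, not the paper's dynamic algorithm; the graph representation (`adj` as a dict of dicts) and all function names are illustrative choices, not taken from the paper.

```python
import heapq
import math
import random


def multi_source_dijkstra(adj, sources):
    """Shortest distance from every vertex to the set `sources`,
    i.e. d(u, S) = min over s in S of d(u, s).

    `adj` maps each vertex to a dict {neighbor: positive edge weight}.
    """
    dist = {s: 0.0 for s in sources}
    pq = [(0.0, s) for s in sources]
    heapq.heapify(pq)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist


def bourgain_embedding(adj, seed=0):
    """Classic static Bourgain-style embedding of a (connected) graph
    metric: each coordinate is the distance to a random vertex subset.
    With O(log^2 n) coordinates, the map achieves O(log n) distortion
    under the l_1 norm, up to scaling (Bourgain, 1985)."""
    rng = random.Random(seed)
    nodes = list(adj)
    n = len(nodes)
    levels = max(1, math.ceil(math.log2(n)))
    coords = {u: [] for u in nodes}
    for i in range(levels):        # subset sizes 2^0, 2^1, ..., 2^(levels-1)
        for _ in range(levels):    # independent repetitions per size
            subset = rng.sample(nodes, min(2 ** i, n))
            dist = multi_source_dijkstra(adj, subset)
            for u in nodes:
                coords[u].append(dist[u])  # assumes the graph is connected
    return coords


def l1_distance(x, y):
    """l_1 distance between two embedding vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    # Tiny usage example: a weighted path a - b - c with d(a, c) = 3.
    adj = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 2.0}, "c": {"b": 2.0}}
    emb = bourgain_embedding(adj, seed=42)
    print(l1_distance(emb["a"], emb["c"]))
```

In the static setting the embedding is computed once; the difficulty addressed by the paper is keeping such a low-distortion map valid while the graph itself changes.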
Lay Summary: Embeddings are a key technique in machine learning. They turn data—like words or nodes in a network—into vectors, allowing algorithms to reason about distances and similarities. Embeddings power many modern systems, including large language models, where they help represent meaning in a way machines can work with.
We study distance-preserving embeddings from a theoretical angle by modeling data as a graph and designing embeddings that preserve distances even as the graph changes over time. Specifically, we focus on the fully dynamic setting, where edges can be both inserted and deleted. Prior work could only handle limited cases, such as when edge weights only increase. We give the first algorithm that efficiently maintains a low-distortion embedding into $\ell_p$ space under fully dynamic updates; a naive baseline illustrating the interface appears below. We also prove that certain stronger guarantees are impossible to achieve efficiently, which shows our result is nearly the best one can hope for.
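To make the update/query interface concrete, here is a naive baseline sketch that recomputes the static embedding from scratch after every edge insertion or deletion, reusing `bourgain_embedding` and `l1_distance` from the sketch above. The class name and method signatures are illustrative assumptions, not the paper's API; the point is that per-update work scales with the whole graph, which is exactly the inefficiency a sublinear update time avoids.

```python
class RecomputeFromScratch:
    """Naive fully dynamic baseline (illustrative only, not the paper's
    algorithm): rebuild the entire static embedding after every update.
    Each update costs on the order of m * polylog(n) time, far above a
    sublinear O(m^{1/q + o(1)}) target; queries are cheap in both cases."""

    def __init__(self, adj):
        # Copy the adjacency structure so callers keep ownership.
        self.adj = {u: dict(nbrs) for u, nbrs in adj.items()}
        self._rebuild()

    def _rebuild(self):
        self.emb = bourgain_embedding(self.adj)

    def insert_edge(self, u, v, w):
        self.adj.setdefault(u, {})[v] = w
        self.adj.setdefault(v, {})[u] = w
        self._rebuild()

    def delete_edge(self, u, v):
        # Assumes the graph stays connected after the deletion.
        self.adj[u].pop(v, None)
        self.adj[v].pop(u, None)
        self._rebuild()

    def distance_estimate(self, u, v):
        # Query: an l_1 distance in the embedding, used as a proxy
        # for the true graph distance d_G(u, v).
        return l1_distance(self.emb[u], self.emb[v])
```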
Link To Code: NTg1N
Primary Area: Theory->Everything Else
Keywords: Dynamic algorithms, Embedding
Submission Number: 7757