When Embedding Models Meet: Procrustes Bounds and Applications

ICLR 2026 Conference Submission 17294 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: representation learning, embedding models, orthogonal Procrustes
TL;DR: We show when embedding models can be aligned via orthogonal Procrustes with tight error guarantees, enabling interoperability across retraining, model upgrades, and modalities.
Abstract: Embedding models trained separately on similar data often produce representations that encode stable information but are not directly interchangeable. This lack of interoperability raises challenges in several practical applications, such as model retraining, partial model upgrades, and multimodal search. We study when two sets of embeddings can be aligned by an orthogonal transformation. We show that if pairwise dot products are approximately preserved, then there exists an isometry that closely aligns the two sets, and we provide a tight bound on the alignment error. This insight yields a simple alignment recipe, Procrustes post-processing, that makes two embedding models interoperable while preserving the geometry of each embedding space. Empirically, we demonstrate its effectiveness in three applications: maintaining compatibility across retrainings, combining different models for text retrieval, and improving mixed-modality search, where it achieves state-of-the-art performance.
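The alignment recipe referenced in the abstract reduces to the classical orthogonal Procrustes problem: given paired embeddings of the same items from two models, find the orthogonal matrix that best maps one space onto the other. The sketch below shows the standard SVD-based solution on synthetic data; the function name, data shapes, and noise level are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def procrustes_align(X, Y):
    """Return the orthogonal matrix R minimizing ||X R - Y||_F.

    X, Y: (n, d) arrays of paired embeddings from two models,
    where row i of X and row i of Y encode the same item.
    X @ R then lives in Y's space while X's geometry (pairwise
    dot products) is preserved, since R is an isometry.
    """
    # Classical closed-form solution: SVD of the cross-covariance X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: model B's embeddings are (approximately) a rotation of model A's.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))                       # embeddings from model A
R_true = np.linalg.qr(rng.normal(size=(128, 128)))[0]  # hidden rotation
Y = X @ R_true + 0.01 * rng.normal(size=(1000, 128))   # noisy embeddings from model B
R = procrustes_align(X, Y)
print(np.linalg.norm(X @ R - Y) / np.linalg.norm(Y))   # small residual alignment error
```

Because the learned map is orthogonal, applying it as a post-processing step changes neither within-model distances nor dot products, which is what makes the two models interoperable for downstream retrieval.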
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 17294