Keywords: embeddings, dispersion, hypersphere, representation learning, separation
Abstract: Learning well-separated features in high-dimensional spaces, such as text or image $\textit{embeddings}$, is crucial for many machine learning applications. Such separation can be effectively achieved through the $\textit{dispersion}$ of embeddings, where unrelated vectors are pushed apart as far as possible. By constraining features to lie on a $\textit{hypersphere}$, we can connect dispersion to well-studied problems in mathematics and physics, where optimal solutions are known only for a limited number of low-dimensional cases. However, in representation learning we typically deal with a large number of features in high-dimensional space, which makes leveraging existing theoretical and numerical solutions infeasible. Therefore, we rely on gradient-based methods to approximate the optimal dispersion on a hypersphere. In this work, we first give an overview of existing methods from disconnected strands of the literature. Next, we propose new reinterpretations of known methods, namely Maximum Mean Discrepancy (MMD) and Lloyd's relaxation algorithm. Finally, we derive a novel dispersion method that directly exploits properties of the hypersphere. Our experiments demonstrate the importance of dispersion in image classification and natural language processing tasks, and show how the algorithms exhibit different trade-offs across regimes.
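For illustration, below is a minimal sketch of the kind of gradient-based dispersion objective the abstract refers to, written in PyTorch. The function name `dispersion_loss` and the temperature `t` are hypothetical choices for this sketch; the log-sum-exp repulsion over pairwise cosine similarities is a generic uniformity-style objective in the spirit of Wang & Isola (2020), not the paper's proposed method.

```python
import torch
import torch.nn.functional as F

def dispersion_loss(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Generic pairwise-repulsion ("uniformity"-style) loss on the hypersphere.

    x: (n, d) batch of embeddings; t: temperature (hypothetical default).
    Minimizing this pushes distinct embeddings apart on the unit sphere.
    """
    z = F.normalize(x, dim=-1)        # project embeddings onto the hypersphere
    sim = z @ z.T                     # pairwise cosine similarities, shape (n, n)
    n = z.shape[0]
    off_diag = ~torch.eye(n, dtype=torch.bool, device=z.device)
    # Smooth maximum over off-diagonal similarities: the gradient concentrates
    # on the closest (least dispersed) pairs and pushes them apart.
    return torch.logsumexp(t * sim[off_diag], dim=0)

# Usage sketch: a few gradient steps dispersing 512 random 128-d points.
points = torch.randn(512, 128, requires_grad=True)
opt = torch.optim.SGD([points], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    loss = dispersion_loss(points)
    loss.backward()
    opt.step()
```

Because the points are renormalized inside the loss, the optimization is effectively constrained to the sphere, which is the setting where the abstract's connection to classical point-dispersion problems applies.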
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6622