Bypassing Skip-Gram Negative Sampling: Dimension Regularization as a More Efficient Alternative for Graph Embeddings

David Mingfei Liu; Arjun Seshadri; Tina Eliassi-Rad; Johan Ugander

Bypassing Skip-Gram Negative Sampling: Dimension Regularization as a More Efficient Alternative for Graph Embeddings

David Mingfei Liu, Arjun Seshadri, Tina Eliassi-Rad, Johan Ugander

27 Sept 2024 (modified: 05 Feb 2025)Submitted to ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: graph embeddings, negative sampling, skip gram, dimension regularization

Abstract: A wide range of graph embedding objectives decompose into two components: one that attracts the embeddings of nodes that are perceived as similar, and another that repels embeddings of nodes that are perceived as dissimilar. Without repulsion, the embeddings would collapse into trivial solutions. Skip-Gram Negative Sampling (SGNS) is a popular and efficient repulsion approach that prevents collapse by repelling each node from a sample of dissimilar nodes. In this work, we show that when repulsion is most needed and the embeddings approach collapse, SGNS node-wise repulsion is, in the aggregate, an approximate re-centering of the node embedding dimensions. Such dimension operations are much more scalable than node operations and yield a simpler geometric interpretation of the repulsion. Our result extends findings from self-supervised learning to the skip-gram model, establishing a connection between skip-gram node contrast and dimension regularization. We use this observation to propose a flexible algorithm augmentation framework that improves the scalability of any existing algorithm using SGNS. The framework prioritizes node attraction and replaces SGNS with dimension regularization. We instantiate this generic framework for LINE and node2vec and show that the augmented algorithms preserve downstream link-prediction performance while reducing GPU memory usage by up to $33.3$% and training time by $22.1$%. Further, for graphs that are globally sparse but locally dense, we show that removing repulsion altogether can improve performance, but, when repulsion is otherwise needed, dimension regularization provides an effective and efficient alternative to SGNS.

Supplementary Material: zip

Primary Area: learning on graphs and other geometries & topologies

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 12324

Loading