Clustering Embedding Tables, Without First Learning Them

Henry Ling-Hei Tsang; Thomas Dybdahl Ahle

Published: 01 Feb 2023, Last Modified: 13 Feb 2023

Submitted to ICLR 2023

Readers: Everyone

Keywords: Clustering, Sketching, Recommendation Systems, Embeddings, Sparse Matrices

TL;DR: We train recommendation systems using less memory than previous work. This is achieved using clustering of a "pseudo embedding table" trained via hashing.

Abstract: Machine learning systems use embedding tables to work with categorical features. In modern recommendation systems these tables can become extremely large, and various methods have been proposed to fit them in memory. Product Quantization and Residual Vector Quantization are among the most successful methods for table compression. They work by replacing table rows with references to "codewords" chosen by k-means clustering. Unfortunately, this means they must know the table before compressing it, so they can save memory only at inference time, not at training time. Recent work has used hashing-based approaches to reduce memory usage during training, but the compression obtained is worse than that achieved by "post-training" quantization. We demonstrate that combining hashing-based and clustering-based algorithms provides the best of both worlds. By first training a hashing-based "sketch", then clustering it, and then training the clustered quantization, our method can achieve compression ratios close to those of post-training quantization with the training-time memory reductions of hashing-based methods. We rigorously prove that this technique works in the least-squares setting.
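The three steps named in the abstract (train a hashed sketch, cluster it, retrain on the clusters) can be illustrated with a toy least-squares sketch. This is not the authors' implementation: the function names `kmeans` and `hash_then_cluster`, the choice of two hash functions per id, and the dense matrices are all assumptions made for illustration, and the per-id `targets` stand in for the training signal.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, assignment)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (requires k <= n).
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    assign = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for c in range(k):
            members = points[assign == c]
            if len(members):
                centroids[c] = members.mean(0)
    return centroids, assign


def hash_then_cluster(targets, k, seed=0):
    """Toy hash -> cluster -> retrain pipeline (illustrative assumption).

    targets: (n, d) array, one least-squares target per id.
    Returns (codebook of shape (k, d), assignment of shape (n,)).
    """
    rng = np.random.default_rng(seed)
    n, d = targets.shape

    # Step 1: hashing-based sketch. Each id is the sum of two hashed rows
    # of a small (k, d) table M, fitted by least squares.
    h = rng.integers(0, k, size=(n, 2))
    H = np.zeros((n, k))  # dense here only because this is a toy
    np.add.at(H, (np.arange(n), h[:, 0]), 1.0)
    np.add.at(H, (np.arange(n), h[:, 1]), 1.0)
    M, *_ = np.linalg.lstsq(H, targets, rcond=None)

    # Step 2: cluster the "pseudo embedding table" implied by the sketch.
    pseudo = H @ M
    _, assign = kmeans(pseudo, k, seed=seed)

    # Step 3: train the clustered quantization. With one-hot assignments,
    # the least-squares optimum is the mean target of each cluster.
    codebook = np.zeros((k, d))
    for c in range(k):
        members = targets[assign == c]
        if len(members):
            codebook[c] = members.mean(0)
    return codebook, assign


# Usage: ids drawn from 4 hidden groups plus noise.
rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 8))
targets = centers[rng.integers(0, 4, size=200)] + 0.1 * rng.normal(size=(200, 8))
codebook, assign = hash_then_cluster(targets, k=4)
print("clustered error:", ((targets - codebook[assign]) ** 2).sum())
print("global-mean error:", ((targets - targets.mean(0)) ** 2).sum())
```

In this toy the clustered table stores only `k` rows plus one index per id, which is the same storage layout as post-training quantization, while the only trainable tables ever held are of size `k` by `d`. A practical version would avoid materialising `H` and `pseudo` densely.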

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

Supplementary Material: zip
