Abstract: Modern DNN-based recommendation systems rely on training-derived real-valued embeddings of sparse categorical features. Input sparsity makes it harder to obtain high-quality embeddings for rarely occurring categories, since their representations are updated infrequently. We demonstrate an effective overparameterization technique that enhances embedding training by enabling useful cross-category learning. Our scheme trains embeddings via a training-time forced factorization of the (linear) embedding layer, with an inner dimension higher than the target embedding dimension.
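As a concrete illustration of the factorized embedding layer described above, a minimal PyTorch sketch follows. The class and parameter names (MletEmbedding, inner_dim, target_dim) are our own illustrative choices, not the paper's code.

```python
import torch
import torch.nn as nn

class MletEmbedding(nn.Module):
    """Two-layer (factorized) embedding: a wide base table with inner_dim
    columns followed by a linear projection down to the target dimension."""
    def __init__(self, num_categories: int, target_dim: int, inner_dim: int):
        super().__init__()
        assert inner_dim > target_dim, "inner dimension should exceed target dimension"
        self.base = nn.Embedding(num_categories, inner_dim)       # W0: |V| x k
        self.proj = nn.Linear(inner_dim, target_dim, bias=False)  # W1: k x d

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Equivalent at inference time to looking up a single |V| x d table.
        return self.proj(self.base(ids))
```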
We show that the factorization breaks update sparsity via non-homogeneous weighting of dense base embedding matrices. This weighting controls the magnitude of the weight updates in each embedding direction and is adaptive to the training-time singular values of the embedding. The dynamics of the singular values further explain the puzzling importance of the factorization's inner dimension for the learning enhancement.
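To sketch the mechanism (using generic factor names A and B rather than the paper's notation), a first-order expansion of one gradient step on the factors of an effective table E = AB exposes the non-homogeneous weighting:

```latex
% One gradient step of size \eta on the factors of E = AB, expanded to
% first order (generic notation; an assumption-level sketch, not the
% paper's derivation).
\begin{align*}
  \nabla_A L &= (\nabla_E L)\, B^\top, \qquad
  \nabla_B L = A^\top (\nabla_E L), \\
  E' &= (A - \eta\, \nabla_A L)(B - \eta\, \nabla_B L) \\
     &= E - \eta \big( A A^\top\, \nabla_E L + \nabla_E L\, B^\top B \big)
        + O(\eta^2).
\end{align*}
```

The raw gradient is pre- and post-multiplied by AAᵀ and BᵀB, whose eigenvalues track the singular values of E (e.g., under balanced initialization), so each embedding direction's update is scaled adaptively; moreover, the products are dense even when ∇E L itself is sparse.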
We call the scheme multi-layer embeddings training (MLET). For deployment efficiency, MLET converts the trained two-layer embedding into a single-layer one at the conclusion of training, avoiding any inference-time increase in model size. MLET consistently produces better models when tested on multiple recommendation models for click-through rate (CTR) prediction. At constant model quality, MLET allows the embedding dimension to be reduced by up to 16x (5.8x on average) across the tested models. MLET retains its benefits when combined with other table-reduction methods (hashing and quantization).
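Continuing the earlier sketch, the end-of-training conversion amounts to folding the two factors into a single table via one matrix product; the function name and shapes below are hypothetical illustrations built on the MletEmbedding class defined above.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fold_to_single_layer(mlet: "MletEmbedding") -> nn.Embedding:
    """Collapse the trained two-layer embedding into one |V| x d table,
    so the deployed model carries no extra parameters."""
    num_categories = mlet.base.num_embeddings
    target_dim = mlet.proj.out_features
    folded = nn.Embedding(num_categories, target_dim)
    # base.weight is (|V|, inner_dim); proj.weight is (target_dim, inner_dim).
    folded.weight.copy_(mlet.base.weight @ mlet.proj.weight.t())
    return folded
```

After folding, lookups in the returned single-layer table reproduce the factorized model's outputs exactly, since the composition of the two linear maps is itself linear.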
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning