Keywords: Shared Embeddings, Embedding Compression, Recommendation Systems
Abstract: Large-scale recommendation systems and other industrial machine learning models depend heavily on embeddings for representing categorical and sparse features. The resulting embedding tables often constitute the majority of a model's parameters, creating a significant bottleneck for training and serving. We propose Shared Embedding Optimization (SEO), a two-stage approach for learning compact yet effective embeddings.
In the first stage, a search model utilizing shared embedding tables and a sequential attention mechanism identifies the most salient embedding subspaces (chunks) from a large pool of candidates. In the second stage, an applier model is trained from scratch using only the small subset of selected chunks.
This results in a significantly smaller and more efficient final model without sacrificing predictive performance.
We demonstrate the effectiveness of SEO on the Criteo Display Advertising Challenge dataset, showing that it is competitive with traditional embedding techniques while substantially reducing the number of embedding parameters.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 20508