Deterministic Compression of Word Embeddings

Published: 01 Jan 2025 · Last Modified: 30 Jun 2025 · IEEE Access 2025 · CC BY-SA 4.0
Abstract: Word embeddings are an indispensable technology in the field of artificial intelligence, particularly for natural language processing models. To further enhance their usability, several studies have tackled the compression of word embeddings while maintaining task performance. In the same vein, this study proposes a word embedding compression method that differs substantially from recent neural compression methods. A notable characteristic of the proposed method is that it guarantees a stable and reproducible compressed representation under identical configurations. The central idea is to combine a deterministic procedure with convex optimization. More specifically, we leverage the fact that identical convex optimization problems yield identical solutions to construct a deterministic procedure for the entire compression process. We conduct both intrinsic and extrinsic evaluations with various word embeddings. Our experimental results show the effectiveness of the proposed method in terms of the performance vs. compression ratio trade-off and its ability to offer reproducible results. Notably, the machine translation experiment in the extrinsic evaluation shows that the proposed method compresses the word embedding layer of the Long Short-Term Memory (LSTM) decoder by a factor of 258 with only a 0.2 point decrease in BLEU score.
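The abstract leaves the concrete formulation to the paper body. As a hedged illustration of the core idea (identical convex problems yield identical solutions, hence reproducible compressed codes), the sketch below encodes each word vector over a deterministically selected anchor set by solving a strictly convex elastic-net problem. All names (compress_embeddings, decompress) and parameter choices (k, alpha, l1_ratio) are assumptions made for this sketch, not the paper's actual method.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def compress_embeddings(E, k=64, alpha=0.01, l1_ratio=0.9):
    """Deterministically compress an embedding matrix E of shape (V, d).

    Anchors are chosen by a fixed rule (largest L2 norm, ties broken by
    row index), and each vector is encoded as the solution of a strictly
    convex elastic-net problem, so identical inputs and identical settings
    always yield identical codes.
    """
    norms = np.linalg.norm(E, axis=1)
    # Deterministic anchor selection: sort by (-norm, row index).
    anchor_idx = np.lexsort((np.arange(len(E)), -norms))[:k]
    D = E[anchor_idx].T  # (d, k) dictionary of anchor vectors

    # With l1_ratio < 1 the objective includes an L2 term, so it is
    # strictly convex and the minimizer is unique; cyclic coordinate
    # descent makes the solve itself run-to-run deterministic.
    enc = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False,
                     selection="cyclic", max_iter=10_000)
    codes = np.zeros((E.shape[0], k))
    for i, x in enumerate(E):
        enc.fit(D, x)          # one small convex problem per word vector
        codes[i] = enc.coef_   # sparse: store only nonzeros in practice
    return anchor_idx, codes

def decompress(E_anchors, codes):
    """Reconstruct the (V, d) embedding matrix from (k, d) anchors and codes."""
    return codes @ E_anchors
```

The design point this sketch tries to capture is the one named in the abstract: because the per-word encoding problem is strictly convex, its solution is unique, so rerunning the compression under the same configuration reproduces the same compressed representation bit for bit, with no dependence on random initialization as in neural compressors.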