Repetition Improves Language Model Embeddings

Jacob Mitchell Springer; Suhas Kotha; Daniel Fried; Graham Neubig; Aditi Raghunathan

Repetition Improves Language Model Embeddings

Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan

Published: 22 Jan 2025, Last Modified: 10 Mar 2025ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: embeddings

TL;DR: Repeat the input twice for high quality embeddings

Abstract: Bidirectional models are considered essential for strong text embeddings. Recent approaches to adapt autoregressive language models (LMs) into strong text embedding models have largely had the requirement to modify the LM architecture to be bidirectional. We challenge this premise by introducing ``echo embeddings'' which converts autoregressive LMs into high quality text embedding models \emph{without} changing the architecture or requiring fine-tuning. By repeating the input and extracting embeddings from the repeated tokens—which have access to all original tokens—echo embeddings improve over classical LM embeddings by over 5\% in zero-shot settings. Our zero-shot embeddings nearly match those obtained by bidirectionally-converted LMs that undergo additional masked-language modeling training. Echo embeddings are also compatible with supervised fine-tuning, matching or outperforming bidirectionally-converted LMs in an apples-to-apples comparison, even with an identical compute budget during training and inference. Overall, repetition is a simple and effective strategy to circumvent the need for bidirectional attention in embedding models, paving the way towards a unified architecture for all NLP tasks.

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7856

Loading