Retrieval Backward Attention without Additional Training: Enhancing Embeddings of Large Language Models via Repetition
Abstract: Language models can be viewed as functions that embed text into Euclidean space, where the quality of the embedding vectors directly determines model performance; however, training such neural networks involves various uncertainties. This paper focuses on improving the performance of pre-trained language models in zero-shot settings through a simple and easily implementable method. We propose a novel backward attention mechanism to enhance contextual information encoding. Our approach achieves significant improvements across multiple tasks, providing valuable insights for advancing zero-shot learning capabilities. \footnote{Our code will be available after the review process.}
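The abstract does not describe the mechanism in detail, but a minimal sketch of the repetition idea named in the title might look like the following. The model name, the mean-pooling choice, and the way the second copy's hidden states are extracted are illustrative assumptions, not the authors' exact procedure.

```python
# Illustrative sketch only: embedding a text by repeating it so that tokens of
# the second copy can attend to the full first copy. This is an assumption
# about what "via Repetition" means, not the paper's confirmed method.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "gpt2"  # hypothetical stand-in for any decoder-only LM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()


def repeated_embedding(text: str) -> torch.Tensor:
    """Embed `text` by feeding it twice and mean-pooling the hidden states
    of (approximately) the second occurrence."""
    first = tokenizer(text, return_tensors="pt")
    doubled = tokenizer(text + " " + text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**doubled).last_hidden_state  # (1, seq_len, dim)
    # Approximate slice of the second copy's tokens (BPE boundaries may shift
    # the split point by a token or two; good enough for a sketch).
    second_copy = hidden[:, first["input_ids"].shape[1]:, :]
    return second_copy.mean(dim=1).squeeze(0)


emb = repeated_embedding("A short sentence to embed.")
print(emb.shape)  # (hidden_dim,)
```

The intuition behind pooling over the second copy is that, in a causal model, those tokens have already seen the entire sentence once, so their representations carry full-context information without any additional training; how the proposed backward attention refines this is left to the paper itself.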
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: Semantics: Lexical and Sentence-Level, Interpretability and Analysis of Models for NLP, Machine Learning for NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings
Languages Studied: Chinese
Keywords: Semantics: Lexical and Sentence-Level, Interpretability and Analysis of Models for NLP, Machine Learning for NLP
Submission Number: 31