Mitigate Position Bias in Large Language Models via Scaling a Single Dimension

Published: 18 Jun 2024, Last Modified: 16 Jul 2024
Venue: LCFM 2024
License: CC BY 4.0
Keywords: Large Language Model, Position Bias, Long-Context
Abstract: Large Language Models (LLMs) are increasingly applied in various real-world scenarios due to their excellent generalization capabilities and robust generative abilities. However, they exhibit position bias, also known as "lost in the middle": placing key information at different positions in a prompt can significantly affect accuracy, a phenomenon that is especially pronounced in long-context scenarios. This paper first explores the micro-level manifestations of position bias, concluding that attention weights are a micro-level expression of position bias. It further identifies that, in addition to position embeddings, the causal attention mask also contributes to position bias by creating position-specific hidden states. Based on these insights, we propose a method to mitigate position bias by scaling these positional hidden states. Experiments on NaturalQuestions multi-document QA, KV retrieval, LongBench, and timeline reorder tasks, using various models including RoPE models, context window-extended models, and ALiBi models, demonstrate the effectiveness and generalizability of our approach. Our method can improve performance by up to 15.2% by modifying just one dimension of the hidden states.
Submission Number: 13
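
The abstract describes the intervention only at a high level: scale one "positional" dimension of the hidden states at inference time. The sketch below illustrates that general idea with PyTorch forward hooks on a Hugging Face causal LM. It is not the paper's implementation; the model name, the dimension index, the scaling factor, and the choice of hooked layers are all placeholder assumptions, since the abstract does not specify how the positional dimension is identified.

```python
# Minimal sketch: scale a single hidden-state dimension during inference.
# All concrete values below (model, TARGET_DIM, SCALE, hooked layers) are
# illustrative assumptions, not settings taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed RoPE model
TARGET_DIM = 0    # hypothetical index of the "positional" hidden dimension
SCALE = 0.5       # hypothetical scaling factor

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def scale_positional_dim(module, inputs, output):
    # A decoder layer returns a tuple whose first element is the hidden
    # states of shape (batch, seq_len, hidden_size); scale one dimension.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden[..., TARGET_DIM] *= SCALE
    return output

# Hook every decoder layer except the last two (an arbitrary illustrative choice).
handles = [
    layer.register_forward_hook(scale_positional_dim)
    for layer in model.model.layers[:-2]
]

prompt = "Document 1: ...\nDocument 2: ...\nQuestion: ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))

for h in handles:
    h.remove()
```

Using forward hooks keeps the intervention outside the model weights, so the scaling can be switched on or off per layer or per task; in the paper itself, the scaled dimension would be chosen from its analysis of positional hidden states rather than fixed by hand as in this sketch.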