Not Lost After All: How Cross-Encoder Attribution Challenges Position Bias Assumptions in LLM Summarization

ACL ARR 2025 May Submission 6955 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Position bias—where Large Language Models (LLMs) overrepresent content from the beginnings and endings of documents while neglecting middle sections—has been considered a core limitation in automatic summarization. To measure position bias, prior studies rely heavily on n-gram matching techniques, which fail to capture semantic relationships in abstractive summaries where content is extensively rephrased. To address this limitation, we introduce a cross-encoder-based alignment method that jointly processes summary–source sentence pairs, enabling more accurate identification of semantic correspondences—even when summaries substantially rewrite the source. Experiments with five LLMs across six summarization datasets reveal position bias patterns markedly different from those reported by traditional metrics. Our findings suggest that these biases primarily reflect rational adaptations to document structure and content rather than true model limitations. Through controlled experiments and analyses across varying document lengths and multi-document settings, we show that LLMs utilize content from all positions more effectively than previously assumed, challenging common claims about "lost-in-the-middle" behavior.
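The alignment idea described in the abstract can be made concrete with a small sketch. The snippet below is one plausible realization, assuming the sentence-transformers CrossEncoder API and a generic STS checkpoint (the abstract names neither a specific model nor scoring details): each summary sentence is scored jointly against every source sentence, and the normalized source position of the best match is recorded. Aggregating these positions over many summaries would yield the kind of position-bias profile the paper analyzes.

```python
import re

import numpy as np
from sentence_transformers import CrossEncoder

# Hypothetical model choice: any STS-style cross-encoder could stand in here;
# the abstract does not specify which checkpoint the authors used.
model = CrossEncoder("cross-encoder/stsb-roberta-base")


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter; a real pipeline would use a proper tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def align_positions(summary: str, source: str) -> list[float]:
    """For each summary sentence, find the best-matching source sentence via
    joint cross-encoder scoring, and return the matched sentences' normalized
    positions in the source (0.0 = first sentence, 1.0 = last)."""
    sum_sents = split_sentences(summary)
    src_sents = split_sentences(source)
    positions = []
    for s in sum_sents:
        # Jointly encode the summary sentence with every source sentence,
        # rather than comparing independently computed embeddings.
        pairs = [(s, t) for t in src_sents]
        scores = model.predict(pairs)  # one semantic-similarity score per pair
        best = int(np.argmax(scores))
        positions.append(best / max(len(src_sents) - 1, 1))
    return positions


doc = "The plant opened in 1990. It employs 400 people. Output fell last year."
summ = "Production at the plant declined recently."
print(align_positions(summ, doc))  # likely [1.0]: aligns to the final sentence
```

In practice one would also want a score threshold so that summary sentences with no good source match are left unaligned rather than forced onto a spurious best match; the threshold value is an implementation detail not given in this abstract.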
Paper Type: Long
Research Area: Summarization
Research Area Keywords: abstractive summarisation, architectures, evaluation, factuality, long-form summarization, multi-document summarization
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 6955