SCOPE: Boosting LLM Efficiency with Scoped Position Encoding

ACL ARR 2026 January Submission 10843 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: Efficient Transformers, Positional Encoding, Long-Context Modeling, Structured Sparsity, Length Extrapolation, Attention Mechanism
Abstract: Positional encodings are fundamental to Transformers, yet explicit methods like RoPE often incur high computational overhead and struggle with length extrapolation. In this paper, we propose \textbf{Sco}ped \textbf{P}osition \textbf{E}ncoding (\textbf{ScoPE}), a novel framework that reimagines structured sparsity as an intrinsic position encoding mechanism. Instead of relying on explicit arithmetic signals, ScoPE assigns exponentially distributed look-back scopes to attention heads. We theoretically demonstrate that this simple topological constraint transforms the model into a hierarchical processor, inducing exponential Order Awareness (OA) with network depth. Consequently, ScoPE is parameter-free and avoids the resolution decay typical of explicit methods. Empirically, it significantly enhances efficiency by masking the majority of attention computations—offering a theoretical $8\times$ reduction in FLOPs. Extensive evaluations on LLaMA-3-8B architectures reveal that ScoPE achieves superior native length extrapolation and robust retrieval fidelity compared to RoPE, all while substantially reducing training and inference latency.
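The abstract's core mechanism, assigning each attention head an exponentially growing look-back scope and masking everything outside it, can be sketched as a per-head causal mask. This is a minimal illustrative sketch, not the authors' implementation: the function name, the `base_window` parameter, and the per-head doubling schedule are our assumptions for demonstration.

```python
import numpy as np

def scoped_attention_masks(seq_len: int, num_heads: int, base_window: int = 4) -> np.ndarray:
    """Build one boolean causal attention mask per head.

    Head h may attend to at most base_window * 2**h tokens, counting back
    from (and including) the current position, so look-back scopes are
    exponentially distributed across heads. The schedule here is an
    illustrative assumption, not the paper's exact configuration.
    """
    masks = np.zeros((num_heads, seq_len, seq_len), dtype=bool)
    for h in range(num_heads):
        window = base_window * (2 ** h)       # scope doubles with each head
        for i in range(seq_len):
            start = max(0, i - window + 1)    # bounded look-back, strictly causal
            masks[h, i, start:i + 1] = True
    return masks

# The fraction of positions kept relative to a full causal mask indicates
# how much attention computation the scoping removes for short-scope heads.
m = scoped_attention_masks(seq_len=64, num_heads=4, base_window=4)
full_causal = np.tril(np.ones((64, 64), dtype=bool)).sum()
density = m.sum(axis=(1, 2)) / full_causal    # per-head kept fraction
```

Because most heads see only a short window, the summed mask density stays well below that of full causal attention, which is the source of the FLOP reduction the abstract claims; only the widest-scope heads approach the full causal cost.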
Paper Type: Long
Research Area: LLM Efficiency
Research Area Keywords: Language Modeling, Efficient/Low-Resource Methods for NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-compute settings (efficiency), Theory
Languages Studied: English
Submission Number: 10843