Sliding Critical Band in RoPE-based Length Extrapolation

Published: 03 Mar 2026, Last Modified: 07 Apr 2026, ICLR 2026 DeLTa Workshop Poster, CC BY 4.0
Keywords: Rotary Position Embedding, Context Extrapolation, Transformers
TL;DR: We propose a unified framework that explains, from a dynamic perspective, the extrapolation mechanism of RoPE-based models.
Abstract: Context extension has become a primary focus in the development of RoPE-based Large Language Models. In this paper, we introduce Sliding Critical Band, a framework demonstrating that the dimensions requiring interpolation migrate dynamically across the spectrum as the extrapolation ratio changes. Building on this, we propose Spectrum Bandwidth Exhaustion, which explains why larger RoPE bases can enhance a model's extrapolation ability. Together, these two concepts offer a more comprehensive understanding of the principles underlying context extrapolation in RoPE-based models. Evaluations on synthetic tasks and the C4 dataset validate the universality of the Sliding Critical Band across diverse scenarios.
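As background for the abstract, the following minimal sketch shows how the set of RoPE dimensions that never complete a full rotation depends on context length and base. It assumes the standard RoPE parameterization (theta_i = base^(-2i/d)); the function names, the head dimension of 128, the training length of 4096, and the "wavelength exceeds context length" criterion are illustrative assumptions, not the paper's definition of the Sliding Critical Band.

```python
import math


def rope_wavelengths(head_dim, base=10000.0):
    """Wavelength (in tokens) of each RoPE frequency pair.

    Assumes standard RoPE: theta_i = base ** (-2 * i / head_dim),
    so the wavelength of pair i is 2 * pi / theta_i.
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


def first_unsaturated_pair(head_dim, context_len, base=10000.0):
    """Index of the first pair whose wavelength exceeds context_len.

    Hypothetical helper for illustration: pairs at or above this index
    do not complete a full rotation within context_len tokens.
    Returns head_dim // 2 if every pair completes a rotation.
    """
    for i, wavelength in enumerate(rope_wavelengths(head_dim, base)):
        if wavelength > context_len:
            return i
    return head_dim // 2


if __name__ == "__main__":
    head_dim, train_len = 128, 4096  # illustrative values, not from the paper
    for ratio in (1, 2, 4, 8):
        boundary = first_unsaturated_pair(head_dim, train_len * ratio)
        print(f"ratio={ratio}: first unsaturated pair index = {boundary}")
```

Under these assumptions, the boundary index moves toward lower frequencies as the target context grows, and a larger base lowers the boundary index at a fixed context length. How this relates to the paper's band definition is not specified in the abstract.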
Submission Number: 78