Rotary Positional Embeddings as Phase Modulation: Theoretical Bounds on the RoPE Base for Long-Context Transformers
Abstract: Rotary positional embeddings (RoPE) are widely used in large language models to encode
token positions through multiplicative rotations, yet their behavior at long context lengths
remains poorly characterized. In this work, we reinterpret RoPE as phase modulation
applied to a bank of complex oscillators, enabling analysis through classical signal processing
theory.
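As an illustration of the phase-modulation view, the following is a minimal sketch and not the paper's implementation. It assumes the standard RoPE frequency schedule theta_j = base^(-2j/d) and pairs consecutive feature dimensions into complex numbers. The function names (rope_frequencies, apply_rope, attention_score) are hypothetical and introduced only for this example. Each complex pair acts as one oscillator whose phase advances linearly with position, so the query-key score depends only on the relative offset m - n.

```python
import numpy as np

def rope_frequencies(head_dim, base=10000.0):
    # Assumed standard schedule: theta_j = base^(-2j/d), j = 0 .. d/2 - 1.
    j = np.arange(head_dim // 2, dtype=np.float64)
    return base ** (-2.0 * j / head_dim)

def apply_rope(x, position, base=10000.0):
    # Pair consecutive real features into complex numbers (one per oscillator),
    # then multiply each by exp(i * position * theta_j): a pure phase shift.
    z = x[0::2] + 1j * x[1::2]
    phase = position * rope_frequencies(x.shape[-1], base)
    return z * np.exp(1j * phase)

def attention_score(q, k, m, n, base=10000.0):
    # Real part of the Hermitian inner product; equals the usual dot product
    # of the rotated real vectors and depends on positions only through m - n.
    zq = apply_rope(q, m, base)
    zk = apply_rope(k, n, base)
    return float(np.real(np.sum(zq * np.conj(zk))))

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
k = rng.standard_normal(8)

near = attention_score(q, k, 5, 2)      # offset 3
far = attention_score(q, k, 103, 100)   # same offset 3, shifted by 98 positions
print(np.isclose(near, far))
```

The slowest oscillator in this sketch has angular frequency base^(-(d-2)/d), which is why the choice of base governs how phase behaves over long offsets.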
Under this formulation, we derive principled lower bounds on the RoPE base parameter that
are necessary to preserve positional coherence over a target context length. These include
a fundamental aliasing bound, analogous to a Nyquist limit, and a DC-component stability
bound that constrains phase drift in low-frequency positional modes. We further extend
this analysis to deep transformers, showing that repeated rotary modulation across layers
compounds angular misalignment, tightening the base requirement as depth increases.
Complementing these results, we derive a precision-dependent upper bound on the RoPE
base arising from finite floating-point resolution. Beyond this limit, incremental phase updates
become numerically indistinguishable, leading to positional erasure even in the absence
of aliasing. Together, the lower and upper bounds define a precision- and depth-dependent
feasibility region—a “Goldilocks zone”—for long-context transformers.
We validate the framework through a comprehensive case study of state-of-the-art models,
including LLaMA, Mistral, and DeepSeek variants, showing that observed successes, failures,
and community retrofits align closely with the predicted bounds. Notably, models that
violate the stability bound exhibit attention collapse and long-range degradation, while
attempts to scale beyond one million tokens encounter a hard precision wall independent of
architecture or training.
Our analysis establishes RoPE base selection as a fundamental necessary architectural constraint,
rather than a tunable hyperparameter, and provides practical guidance for designing,
scaling, and retrofitting long-context transformers under realistic numerical limits.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Pierre_Ablin2
Submission Number: 7458