Keywords: LLM, long context, RoPE
Abstract: Extending the context length of large language models (LLMs) remains challenging, especially when models are expected to preserve reasoning performance as sequence length increases. Many existing methods extend context by modifying rotary positional embeddings (RoPE). However, these approaches typically impose the same positional treatment across all layers and do not account for the hierarchical nature of representation formation inside the model.
We present an Anchor-and-Reason view of long-context processing that emphasizes layer-wise functional differences. Specifically, we posit two regimes. In earlier layers, the model primarily performs an anchoring operation: accurate and sufficiently strong positional signals help organize long input sequences and support the formation of local semantic representations. In later layers, the model increasingly shifts to reasoning: it integrates intermediate representations to support global composition and deduction, where overly rigid positional constraints can become limiting.
Based on this perspective, we propose Layer-Scaling for Position (LASP), a simple layer-dependent adjustment of positional strength. LASP maintains higher-frequency positional components in shallow layers to stabilize sequence-to-semantics mapping, while progressively reducing positional intensity in deeper layers via exponential decay, allowing higher layers to operate with fewer positional restrictions. Experiments on a range of long-context benchmarks show that LASP yields consistent improvements over strong baselines.
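The abstract does not give the exact formula, so the following is a minimal sketch of what a LASP-style layer-dependent adjustment *could* look like: RoPE angles at full strength in the shallowest layer, with the positional intensity damped by an exponential decay toward deeper layers. The function names, the decay rate `alpha`, and the per-layer scalar form are all illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def layer_position_scale(layer_idx, num_layers, alpha=2.0):
    # Hypothetical LASP-style scale: 1.0 at the first layer,
    # decaying exponentially toward the last layer.
    # `alpha` is an assumed decay-rate hyperparameter.
    return float(np.exp(-alpha * layer_idx / max(num_layers - 1, 1)))

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    # Standard RoPE rotation angles theta_{p,i} = p * base^(-2i/dim),
    # with overall positional intensity damped by `scale`.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # shape (dim/2,)
    return scale * np.outer(positions, inv_freq)       # shape (seq, dim/2)

positions = np.arange(8)
num_layers = 24
shallow = rope_angles(positions, dim=16,
                      scale=layer_position_scale(0, num_layers))
deep = rope_angles(positions, dim=16,
                   scale=layer_position_scale(num_layers - 1, num_layers))
```

Under these assumptions, shallow layers keep the full high-frequency positional signal, while the deepest layer sees the same angles uniformly shrunk by a factor of `exp(-alpha)`, loosening positional constraints where the abstract argues reasoning dominates.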
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: continual learning, fine-tuning
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 6287