Attention Localization Through Separator Tokens: Unlocking Long Numerical Sequence Processing in LLMs

19 Sept 2025 (modified: 06 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Large Language Model Applications, Numerical Sequence Processing
Abstract: Despite possessing massive context windows, Large Language Models (LLMs) exhibit a sharp decline in performance when processing long numerical sequences, a critical failure for precision-sensitive applications. We identify the root cause as the models' inability to focus attention on a manageable segment of the sequence, which leads to dispersed attention and inaccurate results. To address this, we introduce \textbf{Sep}arate \textbf{N}umerical \textbf{S}equences (SepNS), a training-free inference framework that guides LLMs by strategically inserting separators into numerical inputs. This simple modification encourages a ``separate and focus'' strategy, which we verify through attention analysis showing that separators induce localized focus on distinct segments. Extensive experiments on nine high-performance LLMs show that SepNS substantially boosts accuracy, achieving average gains of \textbf{35.6\%} across all evaluated datasets with minimal overhead. Our work demonstrates that simple, structured input formatting acts as a powerful attention-focusing mechanism, unlocking long numerical processing capabilities in LLMs without any retraining.
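The abstract describes the core mechanism as inserting separators into numerical inputs before inference, but does not specify the placement rule. The sketch below is a minimal illustration only, assuming fixed-size chunking with an explicit " | " separator; the function name `insert_separators`, the chunk size, and the separator string are hypothetical choices, not details from the paper.

```python
# Minimal sketch of separator insertion for a long numerical sequence.
# NOTE: insert_separators, chunk_size=5, and the " | " separator are
# illustrative assumptions; the paper's exact placement strategy may differ.

def insert_separators(numbers, chunk_size=5, sep=" | "):
    """Split a list of numbers into fixed-size chunks and join them with an
    explicit separator, so the model can attend to one chunk at a time."""
    chunks = [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]
    return sep.join(" ".join(str(n) for n in chunk) for chunk in chunks)

if __name__ == "__main__":
    seq = list(range(1, 21))  # toy numerical sequence
    prompt = f"Sum the following numbers: {insert_separators(seq)}"
    print(prompt)
    # Sum the following numbers: 1 2 3 4 5 | 6 7 8 9 10 | 11 12 13 14 15 | 16 17 18 19 20
```

The separated prompt would then be passed to the LLM unchanged, keeping the approach training-free as the abstract states.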
Primary Area: foundation or frontier models, including LLMs
Submission Number: 18733