Top-$n\sigma$: Eliminating Noise in Logit Space for Robust Token Sampling of LLM

ACL ARR 2025 February Submission 1757 Authors

14 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Large language models (LLMs) rely heavily on sampling methods to generate diverse and high-quality text. While existing sampling methods like top-$p$ and min-$p$ have identified the detrimental effects of low-probability tails in LLMs' outputs, they still fail to effectively distinguish diversity from noise. This limitation stems from their reliance on probability-based metrics that are inherently sensitive to temperature scaling. Through empirical and theoretical analysis, we make two key discoveries: (1) the pre-softmax logits exhibit a clear statistical separation between informative tokens and noise, and (2) min-$p$ and top-$(1-p)$ are mathematically equivalent under a uniform distribution over logits. These findings motivate the design of top-$n\sigma$, a novel sampling method that identifies informative tokens by eliminating noise directly in logit space. Unlike existing methods that become unstable at high temperatures, top-$n\sigma$ achieves temperature-invariant token selection while preserving output diversity. Extensive experiments across reasoning and creative writing tasks demonstrate that our method consistently outperforms existing approaches, with particularly significant improvements in high-temperature settings.
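To make the logit-space selection described in the abstract concrete, the sketch below keeps only tokens whose pre-softmax logits lie within $n$ standard deviations of the maximum logit, masking everything else before temperature scaling and sampling. This is a minimal PyTorch sketch under that assumed thresholding rule; the function name `top_nsigma_filter`, the default $n$, and the usage values are illustrative and not the authors' reference implementation.

```python
import torch

def top_nsigma_filter(logits: torch.Tensor, n: float = 1.0) -> torch.Tensor:
    """Sketch of a top-n-sigma style filter (assumed rule, not the paper's code).

    Keeps tokens whose logits are within n standard deviations of the
    maximum logit; all other logits are masked to -inf before softmax.
    """
    # Statistics are computed on the raw logits, so the selected token set
    # does not change when temperature scaling is applied afterwards.
    max_logit = logits.max(dim=-1, keepdim=True).values
    sigma = logits.std(dim=-1, keepdim=True)
    threshold = max_logit - n * sigma
    return logits.masked_fill(logits < threshold, float("-inf"))

# Usage: filter in logit space first, then apply temperature and sample.
logits = torch.randn(1, 32000)                     # placeholder vocabulary logits
filtered = top_nsigma_filter(logits, n=1.0)
probs = torch.softmax(filtered / 1.5, dim=-1)      # high-temperature sampling
next_token = torch.multinomial(probs, num_samples=1)
```

Because both the maximum and the standard deviation of the logits scale by the same factor under temperature, filtering before (or after) dividing by the temperature yields the same candidate set, which is the temperature-invariance property the abstract highlights.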
Paper Type: Long
Research Area: Generation
Research Area Keywords: inference methods, text-to-text generation
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 1757