Keywords: LLM Reasoning, Discourse markers, Token-level signal, Distillation, Interpretability, Ensemble
Abstract: The emergence of discourse-like tokens such as "wait" and "therefore" in large language models (LLMs) offers a unique window into their reasoning processes. However, systematic analyses of how such signals vary across training strategies and model scales are lacking. In this paper, we analyze token-level signals through token probabilities across a range of models. We find that specific tokens correlate strongly with reasoning correctness, varying with training strategy while remaining stable across model scales. A closer look at the "wait" token in relation to answer probability shows that models fine-tuned on small-scale datasets acquire reasoning ability through such signals but exploit them only partially. This work provides a systematic lens for observing and understanding the dynamics of LLM reasoning.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: counterfactual/contrastive explanations, probing
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 5709