Keywords: Reasoning models, intermediate tokens, post-training, reasoning traces
TL;DR: Anthropomorphizing intermediate tokens as reasoning/thinking traces is (1) wishful (2) has little concrete supporting evidence (3) engenders false confidence in the final results to end users and (4) is pushing the community into dubious directions.
Abstract: Intermediate token generation (ITG), where a model produces output before the solution, has been proposed as a method to improve the performance of language models on reasoning tasks. These intermediate tokens have been called "reasoning traces" or even "thoughts" -- implicitly anthropomorphizing the model, implying these tokens resemble steps a human might take when solving a challenging problem. In this position paper, we present evidence that this anthropomorphization isn't a harmless metaphor, and instead is quite dangerous -- it confuses the nature of these models and how to use them effectively, and leads to questionable research.
Submission Type: Position/Review Paper (4-9 Pages)
Submission Number: 26
Loading