Same Meaning, Different Tokens: Tokenization-Induced Shifts in Representations and Predictions

Published: 24 Apr 2026 · Last Modified: 24 Apr 2026 · CauScale 2026 · CC BY 4.0
Keywords: Tokenizers, Internal representations
Abstract: Tokenization, the process by which an input string is segmented into a sequence of tokens, is the first step in most language modeling pipelines and defines the space in which the entire model operates. Despite its importance, its effects on the internal representations of large language models are not fully understood. This work builds toward understanding these effects. We first study the similarity of hidden-layer representations in models that are trained identically except for their tokenizers. We then study the effects of semantics-preserving perturbations on the output distribution and on surprisal. We show that although meaning is preserved, changes in tokenization can shift hidden-layer representations, the output distribution, and surprisal. We identify tokenization as an early, discrete choice that can systematically shape how interventions on surface form translate into changes in internal representations and next-token predictions.
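
To make the abstract's second probe concrete, here is a minimal sketch of how one might measure the effect of a semantics-preserving perturbation that changes tokenization. It assumes the Hugging Face `transformers` library and uses GPT-2 as a stand-in model; the specific perturbation (inserting a space before the final period) and the comparison metrics are illustrative assumptions, not the paper's actual setup.

```python
# Sketch: compare surprisal and a hidden-state similarity for two
# surface forms with the same meaning but different token sequences.
# GPT-2 and the space-before-period perturbation are placeholder
# choices, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def surprisal_and_hidden(text: str):
    """Return total surprisal (nats) of `text` and the final-layer
    hidden state at the last position."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # `out.loss` is the mean negative log-likelihood per predicted
    # token; multiply by the number of predictions for the total.
    total_surprisal = out.loss.item() * (ids.shape[1] - 1)
    last_hidden = out.hidden_states[-1][0, -1]
    return total_surprisal, last_hidden

original = "The cat sat on the mat."
perturbed = "The cat sat on the mat ."  # same meaning, different tokens

s0, h0 = surprisal_and_hidden(original)
s1, h1 = surprisal_and_hidden(perturbed)
cos = torch.nn.functional.cosine_similarity(h0, h1, dim=0).item()
print(f"surprisal: {s0:.2f} vs {s1:.2f}; last-state cosine: {cos:.3f}")
```

A gap between the two surprisal values, or a last-state cosine similarity below 1.0, would illustrate the kind of tokenization-induced shift the abstract describes; the paper's own experiments and metrics may differ.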
Submission Number: 22