Keywords: Self-Attention, Clusters, Transformers, Inductive bias, Representation geometry, Model diagnostics
TL;DR: A training-free method using attention's innate clustering reveals how architectural choices affect model performance.
Abstract: We introduce a parameter-free framework that isolates the self-attention mechanism by stripping away all learned parameters. Through iterative application, we demonstrate that self-attention alone intrinsically drives the formation of semantically meaningful clusters in the representation space. Analyzing this behavior across global, local-window, and hybrid attention patterns reveals their inherent geometric biases independent of training. Crucially, we find that query scaling (as used in Longformer) induces an implicit dimensionality reduction that systematically improves model generalization, an insight we validate experimentally. This geometric bias is consistent across both low-dimensional data and high-dimensional real-world representations. Probing a pre-trained model confirms that the clustering behavior is architecturally inherent and further refined by learning. Our work provides a useful diagnostic tool for evaluating attention architectures prior to training.
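To make the core idea concrete, here is a minimal illustrative sketch (not the authors' exact framework) of iterating parameter-free self-attention, assuming the standard softmax form with Q = K = V = X and no learned projections; the toy data and variable names are assumptions for demonstration only.

```python
# Minimal sketch: repeatedly apply softmax(X X^T / sqrt(d)) X with no learned
# weights and observe that points drift toward cluster representatives.
import numpy as np

def parameter_free_attention(X: np.ndarray) -> np.ndarray:
    """One round of self-attention with Q = K = V = X (no learned parameters)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                 # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn @ X                               # convex re-mixing of tokens

rng = np.random.default_rng(0)
# Toy data (an assumption): two well-separated Gaussian blobs standing in for
# two semantic groups of tokens.
X = np.concatenate([rng.normal(+2.0, 0.5, size=(16, 8)),
                    rng.normal(-2.0, 0.5, size=(16, 8))])

for _ in range(20):
    X = parameter_free_attention(X)

# Within-group spread shrinks much faster than the gap between group means,
# i.e. the untrained attention iteration alone produces cluster structure.
within = np.linalg.norm(X[:16] - X[:16].mean(0), axis=1).mean()
between = np.linalg.norm(X[:16].mean(0) - X[16:].mean(0))
print(f"within-cluster spread: {within:.4f}, between-cluster gap: {between:.4f}")
```

The same loop can be repeated with local-window or hybrid attention masks, or with queries rescaled before the softmax, to compare the geometric biases the abstract describes.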
Primary Area: interpretability and explainable AI
Supplementary Material: zip
Submission Number: 5040