Functional Equivalence in Attention: A Comprehensive Study with Applications to Linear Mode Connectivity
Keywords: functional equivalence, attention mechanism, positional encoding
TL;DR: This work studies functional equivalence in Transformers with positional encodings.
Abstract: The parameter space of neural networks serves as a surrogate for the underlying function class; however, the mapping is inherently non-injective, as revealed by functional equivalence, wherein distinct parameter configurations yield identical input-output behaviors. While this phenomenon has been analyzed in classical architectures such as fully connected and convolutional networks, the increasing complexity of modern designs, particularly attention-based models, presents new and significant challenges. Prior analyses of multi-head attention have been largely restricted to the vanilla formulation, thereby neglecting crucial components such as positional encodings that fundamentally alter architectural symmetries and render earlier results inapplicable. In this work, we undertake a formal study of functional equivalence in Transformers with positional encodings. Focusing on the two most widely used variants, sinusoidal and rotary, we demonstrate that sinusoidal encodings preserve the equivalence structure of vanilla attention, whereas rotary encodings significantly reduce the associated symmetry group, thereby enhancing expressivity. This theoretical insight offers a principled explanation for the growing prominence of RoPE in practice. Furthermore, we extend our analysis to investigate how positional encodings influence the phenomenon of linear mode connectivity (LMC). By introducing an alignment algorithm, we empirically validate the presence and variability of LMC across a wide range of Transformer configurations, datasets, and modalities, demonstrating that the type of positional encoding plays a decisive role in shaping the connectivity of solutions.
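The core notion of functional equivalence above can be illustrated with a minimal numpy sketch (not the paper's construction): permuting the heads of a vanilla multi-head attention layer relabels its parameters without changing the input-output map, since head outputs are summed. All shapes and names here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mha(X, Wq, Wk, Wv, Wo):
    """Vanilla multi-head attention (no positional encoding).
    X: (T, d); Wq, Wk, Wv: (H, d, dh); Wo: (H, dh, d)."""
    H, d, dh = Wq.shape
    out = np.zeros_like(X)
    for h in range(H):
        Q, K, V = X @ Wq[h], X @ Wk[h], X @ Wv[h]
        A = softmax(Q @ K.T / np.sqrt(dh))
        out += (A @ V) @ Wo[h]  # head contributions are summed
    return out

rng = np.random.default_rng(0)
T, d, H, dh = 5, 8, 4, 2
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(H, d, dh)) for _ in range(3))
Wo = rng.normal(size=(H, dh, d))

perm = rng.permutation(H)  # relabel the heads: a distinct parameter configuration
y1 = mha(X, Wq, Wk, Wv, Wo)
y2 = mha(X, Wq[perm], Wk[perm], Wv[perm], Wo[perm])
assert np.allclose(y1, y2)  # identical input-output behavior
```

Alignment algorithms for linear mode connectivity exploit exactly such symmetries, searching for a permutation that brings two independently trained solutions into the same parameter basin before interpolating.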
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 5096