Training-Free Spectral Fingerprints of Voice Processing in Transformers

ICLR 2026 Conference Submission 25528 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: transformer interpretability, graph signal processing, attention analysis, cross-linguistic analysis, spectral connectivity, voice alternation, tokenizer effects, Fiedler eigenvalue

TL;DR: Graph signal processing on attention reveals model-family-specific shifts in algebraic connectivity (Fiedler value) under voice alternation across 20 languages; these shifts align with tokenization effects, behavioral fit, and head-ablation evidence.

Abstract: Different transformer architectures implement identical linguistic computations via distinct connectivity patterns, yielding model-imprinted ``computational fingerprints'' detectable through spectral analysis. Using graph signal processing on attention-induced token graphs, we track changes in algebraic connectivity (Fiedler value, $\Delta\lambda_2$) under voice alternation across 20 languages and three model families, with a prespecified early window (layers 2--5). Our analysis uncovers clear architectural signatures: Phi-3-Mini shows a dramatic English-specific early-layer disruption ($\overline{\Delta\lambda_2}_{[2,5]} \approx -0.446$), while effects in the 19 other languages are minimal, consistent with public documentation that positions the model primarily for English use. Qwen2.5-7B displays small, distributed shifts that are largest for morphologically rich languages, and LLaMA-3.2-1B exhibits systematic but muted responses. These spectral signatures correlate strongly with behavioral differences (Phi-3: $r=-0.976$) and are modulated by targeted attention-head ablations, linking the effect to early attention structure and confirming functional relevance. Taken together, the findings are consistent with the view that training emphasis can leave detectable computational imprints: specialized processing strategies that manifest as measurable connectivity patterns during syntactic transformations. Beyond voice alternation, the framework differentiates reasoning modes, indicating utility as a simple, training-free diagnostic for revealing architectural biases and supporting model reliability analysis.
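
As a rough illustration of the quantity the abstract tracks, the following minimal sketch computes a Fiedler value from one layer's attention and averages its change over the early window. The abstract does not specify the graph construction, so several choices here are assumptions rather than the authors' method: attention is averaged over heads, symmetrized as $(A + A^\top)/2$, self-loops are dropped, and the unnormalized Laplacian $L = D - W$ is used. The function names `fiedler_value` and `mean_delta_lambda2`, and the convention that layers are indexed by position in a list, are hypothetical.

```python
import numpy as np


def fiedler_value(attn):
    """Second-smallest Laplacian eigenvalue of an attention-induced token graph.

    attn: array of shape (heads, n_tokens, n_tokens) for a single layer.
    Assumptions (not from the source): head-averaging, symmetrization,
    no self-loops, unnormalized Laplacian L = D - W.
    """
    a = np.asarray(attn, dtype=float).mean(axis=0)  # average over heads
    w = 0.5 * (a + a.T)                             # undirected edge weights
    np.fill_diagonal(w, 0.0)                        # drop self-attention loops
    lap = np.diag(w.sum(axis=1)) - w                # L = D - W
    eigvals = np.linalg.eigvalsh(lap)               # ascending, real (L is symmetric)
    return float(eigvals[1])                        # lambda_2


def mean_delta_lambda2(attn_base, attn_alt, layers=(2, 3, 4, 5)):
    """Mean of lambda_2(alternated) - lambda_2(base) over the chosen layers.

    attn_base, attn_alt: sequences indexed by layer, each entry shaped
    (heads, n_tokens, n_tokens). The two sentences may differ in length,
    since each graph is analyzed separately.
    """
    deltas = [fiedler_value(attn_alt[l]) - fiedler_value(attn_base[l])
              for l in layers]
    return float(np.mean(deltas))
```

Under these assumptions, a negative mean difference indicates that the alternated sentence induces a less tightly connected token graph in the early layers than the base sentence does.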

Primary Area: interpretability and explainable AI

Submission Number: 25528
