Abstract: Self-attention is a cornerstone of modern deep learning, yet its dense dot-product formulation offers limited interpretability and lacks explicit structural constraints. We propose SVD-inspired Attention (SVDA), a novel self-attention mechanism that introduces normalized query/key projections and a learnable diagonal spectral modulation, drawing direct motivation from the structure of Singular Value Decomposition (SVD). This formulation separates directional alignment from spectral emphasis, offering a geometrically grounded and interpretable variant of attention. We formalize SVDA within a standard multi-head Transformer architecture and introduce a suite of structure-aware indicators—such as spectral entropy, effective rank, and selectivity—that quantify interpretability and sparsity in attention dynamics. Our analysis highlights SVDA’s capacity for structured, energy-aware attention without compromising architectural compatibility or expressiveness. This work provides a theoretical foundation and diagnostic framework for structured attention models aimed at interpretability, compression, and semantic transparency.
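To make the abstract's description concrete, the following is a minimal NumPy sketch of one attention head with L2-normalized query/key projections and a learnable diagonal spectral vector. It is an illustration under stated assumptions, not the authors' implementation: the function names (`svda_head`, `spectral_entropy`, `effective_rank`), the score form `Q diag(sigma) K^T`, the energy normalization `sigma**2 / sum(sigma**2)`, and the definition of effective rank as the exponential of the spectral entropy are all choices made here for illustration, and the paper's exact definitions may differ.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale each vector to (approximately) unit length so only direction remains.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def svda_head(X, Wq, Wk, Wv, sigma):
    """One attention head (hypothetical sketch).

    X: (n_tokens, d_model); Wq, Wk, Wv: (d_model, d_head); sigma: (d_head,)
    """
    Q = l2_normalize(X @ Wq)        # directional part of the queries
    K = l2_normalize(X @ Wk)        # directional part of the keys
    V = X @ Wv
    # Assumed score form: Q diag(sigma) K^T, with sigma weighting each dimension.
    scores = (Q * sigma) @ K.T
    A = softmax(scores, axis=-1)    # each row is a distribution over tokens
    return A @ V, A

def spectral_entropy(sigma, eps=1e-12):
    # Assumption: treat normalized squared entries of sigma as an energy distribution.
    p = sigma**2 / np.sum(sigma**2)
    return -np.sum(p * np.log(p + eps))

def effective_rank(sigma):
    # Assumption: effective rank taken as exp(entropy) of the energy distribution.
    return np.exp(spectral_entropy(sigma))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
sigma = np.array([2.0, 1.0, 0.5, 0.1])   # stands in for a learned parameter
out, A = svda_head(X, Wq, Wk, Wv, sigma)
print(out.shape, A.shape, effective_rank(sigma))
```

Because the queries and keys have unit norm, each score in this sketch is bounded in magnitude by the largest entry of `|sigma|`, so the spectral vector alone controls how sharply the head can attend. This is one way to read the abstract's separation of directional alignment from spectral emphasis.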
External IDs: doi:10.1109/access.2025.3586739