Token-Wise Kernels (TWiKers) for Vicinity-Aware Attention in Transformers

ACL ARR 2025 May Submission 1567 Authors

17 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Self-attention mechanisms in transformers enable tokens to interact across a sequence but lack an explicit inductive bias toward local contextual dependencies, an inherent characteristic of human languages. We propose Token-Wise Kernels (TWiKers), a novel enhancement to transformers in which each token learns a small convolutional kernel applied to the keys or values. Each token's kernel is initialized to the "Central Dirac" (e.g., [0, 1, 0] for size 3), so that the token initially "bears" all attention directed at it alone, with no redistribution to its neighbors. During training these kernels adapt; the greater a kernel's deviation from the Central Dirac, the more strongly attention is redistributed to neighboring tokens. This introduces the first transformer weights with direct semantic interpretability. Our experiments show that content words (e.g., nouns and verbs) retain self-focus, while function words (e.g., prepositions and conjunctions) shift attention toward their neighbors, aligning with their syntactic and semantic roles. We further apply TWiKers to distinguish literary genres, historical periods, and authors, demonstrating their effectiveness in capturing high-level stylistic patterns. Finally, by allowing the kernels to vary across attention heads, we show the potential of TWiKers as a new inductive bias to enhance transformer training.
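The core mechanism described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `apply_twikers` and the array shapes are assumptions for exposition. Each token mixes its own value vector with those of its neighbors through a per-token kernel; the "Central Dirac" initialization leaves the values unchanged, and training would let each kernel drift away from it.

```python
import numpy as np

def apply_twikers(values, kernels):
    """Mix each token's value vector with its neighbors via a per-token
    kernel (illustrative sketch; names and shapes are assumptions).

    values:  (seq_len, d) array of value vectors
    kernels: (seq_len, k) array with k odd; row t is token t's kernel
    """
    seq_len, d = values.shape
    k = kernels.shape[1]
    half = k // 2
    # zero-pad along the sequence axis so edge tokens have neighbors
    padded = np.pad(values, ((half, half), (0, 0)))
    out = np.zeros_like(values)
    for t in range(seq_len):
        window = padded[t : t + k]   # (k, d) local neighborhood
        out[t] = kernels[t] @ window # kernel-weighted sum -> (d,)
    return out

# Central Dirac initialization, e.g. [0, 1, 0] for k = 3:
# the identity mapping, i.e. no attention redistributed to neighbors.
seq_len, d, k = 5, 4, 3
values = np.random.randn(seq_len, d)
dirac = np.zeros((seq_len, k))
dirac[:, k // 2] = 1.0
assert np.allclose(apply_twikers(values, dirac), values)
```

A kernel such as [0.2, 0.6, 0.2] for a given token would instead blend in 20% of each adjacent token's value, which is the kind of deviation from the Dirac that the paper reads as attention redistribution to neighbors.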
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Explanation faithfulness, Probing, Feature attribution, Data influence
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Keywords: Self-attention mechanisms, Causal language modeling, Interpretability, Lexical relationships
Submission Number: 1567