Keywords: length generalisation, attention glitches, flip-flops, algorithmic reasoning
TL;DR: We create a novel attention mechanism to address some limitations of standard self-attention and apply it to the flip-flop language modeling task
Abstract: Transformers struggle with generalisation, displaying poor performance even on basic yet fundamental tasks such as flip-flop language modeling. We test whether these limitations can be explained by two key failures of self-attention. The first is the inability to fully remove irrelevant information. The second concerns position: even when a key is completely irrelevant, learned positional biases may unintentionally up-weight it, which becomes dangerous when distances fall out of distribution. To probe this, we propose TRA, a novel attention mechanism, and use it to demonstrate that these issues underlie generalisation failures on the flip-flop task.
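The sketch below is not the paper's TRA mechanism; it is a minimal illustration of the two failure modes the abstract attributes to standard softmax attention, using made-up scores, distances, and a hypothetical distance-decay positional bias purely for demonstration.

```python
# Illustrative sketch (assumed toy setup, not the paper's TRA mechanism) of two
# failure modes of standard softmax attention described in the abstract.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Failure 1: softmax can never assign exactly zero weight to an irrelevant key,
# so irrelevant information is never fully removed.
scores = np.array([8.0, -4.0, -4.0])            # one relevant key, two irrelevant ones
print(softmax(scores))                          # irrelevant keys still receive > 0 mass

# Failure 2: an additive learned positional bias can up-weight an irrelevant key
# once query-key distances fall outside the training range.
def biased_attention(scores, distances, bias_fn):
    return softmax(scores + bias_fn(distances))

bias_fn = lambda d: -0.1 * d                    # hypothetical distance-decay bias
scores = np.array([2.0, -2.0])                  # key 0 relevant, key 1 irrelevant
in_dist = np.array([10.0, 2.0])                 # distances like those seen in training
out_dist = np.array([500.0, 2.0])               # relevant key is now much farther away
print(biased_attention(scores, in_dist, bias_fn))   # relevant key dominates
print(biased_attention(scores, out_dist, bias_fn))  # irrelevant key wins the attention
```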
Code: ipynb
Submission Number: 96