Keywords: Associative memory, transformer attention, iterative inference, Hopfield networks, attractor dynamics, MNIST embeddings, retrieval accuracy
TL;DR: Iteratively applying transformer attention at inference time uncovers latent associative memory, improving recall from partial or noisy cues in toy experiments.
Abstract: Associative memory has re-emerged as a useful lens for understanding modern attention-based architectures. In this preliminary study, we conduct controlled experiments on synthetic Gaussian vectors and MNIST embeddings to investigate whether repeated application of transformer attention can improve recall from partial or noisy cues. We observe that iterative attention improves retrieval accuracy by 13-16% and exhibits stable convergence behavior consistent with attractor-like dynamics. These results suggest that inference-time iterations can uncover latent associative behavior in attention-based models, though further evaluation on larger and more complex datasets is needed.
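
The idea in the abstract, repeatedly applying attention to a partial cue until it settles on a stored pattern, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head whose keys and values are both the stored patterns (the modern Hopfield-style update), and the function name `iterative_attention_recall`, the inverse temperature `beta`, the step count, and the data sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def iterative_attention_recall(cue, memory, beta=8.0, steps=5):
    """Apply softmax attention repeatedly, feeding each output back as the query.

    Assumption: the rows of `memory` act as both keys and values,
    so one step is state <- softmax(beta * memory @ state) @ memory.
    """
    state = cue.copy()
    for _ in range(steps):
        weights = softmax(beta * (memory @ state))  # attention over stored patterns
        state = weights @ memory                    # weighted average of patterns
    return state


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic Gaussian patterns, normalized to unit length (illustrative sizes).
rng = np.random.default_rng(0)
memory = rng.standard_normal((16, 64))
memory /= np.linalg.norm(memory, axis=1, keepdims=True)

# Partial cue: keep the first half of one stored pattern, zero the rest.
target = 3
cue = memory[target].copy()
cue[32:] = 0.0

recalled = iterative_attention_recall(cue, memory)
retrieved_index = int(np.argmax(memory @ recalled))

print("retrieved index:", retrieved_index)
print("cosine(cue, target):     ", round(cosine(cue, memory[target]), 3))
print("cosine(recalled, target):", round(cosine(recalled, memory[target]), 3))
```

In this toy setting each iteration pulls the state toward the stored pattern most similar to the cue, which is the attractor-like behavior the abstract describes. A larger `beta` sharpens the attention weights and speeds convergence, while a small `beta` tends to blend several patterns together.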
Submission Number: 3