Keywords: Mechanistic interpretability; Optimization-based memorization; Heavy-tailed data; Zipf's law; LLMs
TL;DR: Scaling laws for associative memories to better understand optimization-based memorization
Abstract: Learning arguably involves the discovery and memorization of abstract rules.
But how do associative memories appear in transformer architectures optimized with gradient descent algorithms?
We derive precise scaling laws with respect to parameter size for a simple input-output associative memory model, and discuss the statistical efficiency of different estimators, including optimization-based algorithms.
We provide extensive numerical experiments to validate and interpret theoretical results, including fine-grained visualizations of the stored memory associations.
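To make the setup concrete, below is a minimal sketch of the kind of input-output associative memory the abstract describes: associations stored as a weighted sum of outer products between random embeddings, with Zipf-distributed input frequencies. All names and values here (N, M, d, alpha, the weighting q(x) = p(x)) are illustrative assumptions, not the paper's actual model or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N input tokens, M output classes, embedding dimension d.
N, M, d = 1024, 32, 128

# Random embeddings (roughly orthogonal in high dimension).
E = rng.standard_normal((N, d)) / np.sqrt(d)   # input embeddings e_x
U = rng.standard_normal((M, d)) / np.sqrt(d)   # output unembeddings u_y

# Ground-truth association: each input token x maps to a class y = f*(x).
f_star = rng.integers(0, M, size=N)

# Zipf's law over inputs: token x appears with probability proportional to 1 / x^alpha.
alpha = 1.0
p = 1.0 / np.arange(1, N + 1) ** alpha
p /= p.sum()

# Store associations as a sum of weighted outer products,
# W = sum_x q(x) * u_{f*(x)} e_x^T, here with the illustrative choice q(x) = p(x).
W = (U[f_star] * p[:, None]).T @ E             # shape (d, d)

# Recall: predict y(x) = argmax_y u_y^T W e_x, and measure error weighted by p.
scores = U @ W @ E.T                           # shape (M, N)
pred = scores.argmax(axis=0)
error = p[pred != f_star].sum()
print(f"recall error under the Zipf distribution: {error:.3f}")
```

Varying d in such a sketch traces out an error-versus-capacity curve, which is the quantity the scaling laws characterize.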
Submission Number: 3