Power-law feature statistics explain test reconstruction gaps in Associative Memories
Keywords: Associative Memories, Hopfield Networks, Generalization, Data Structure
TL;DR: Power-law feature statistics provide a principled explanation for the test reconstruction gap, suggesting that random-feature structure is a good model of data for self-supervised tasks.
Abstract: Associative memories have recently been shown to generalize, that is, to produce attractors near previously unseen examples. While this phenomenon is understood in the synthetic setting of random-feature examples as the exploitation of mixed spurious states, it is unclear whether the same explanation extends to real datasets, where the emergent attractors do not coincide perfectly with test examples (a visible test reconstruction gap). In this work, we introduce a more natural model of data in which random features are sampled with power-law occurrence, and we show that this change produces many benign effects for generalization, including an implicit form of regularization. Overall, this new data structure provides an interpretation of the test reconstruction gap that is consistent with the known mechanism for generalization based on mixed spurious states.
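To make the data model concrete, here is a minimal sketch (not the authors' code; all parameter values such as `alpha`, the feature count `D`, and the features-per-example count `s` are illustrative assumptions) of random-feature examples with power-law feature occurrence, plus a classical Hebbian Hopfield reconstruction check on an unseen example:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 200      # neurons / input dimension (illustrative)
D = 50       # number of latent random features (illustrative)
s = 5        # features superposed per example (illustrative)
alpha = 1.5  # power-law exponent for feature occurrence (illustrative)

# Random binary (+/-1) feature vectors.
features = rng.choice([-1.0, 1.0], size=(D, N))

# Power-law occurrence probabilities: p_k proportional to k^(-alpha).
p = np.arange(1, D + 1, dtype=float) ** (-alpha)
p /= p.sum()

def sample_example():
    """Superpose s features drawn with power-law probabilities; binarize."""
    idx = rng.choice(D, size=s, replace=False, p=p)
    return np.sign(features[idx].sum(axis=0) + 1e-9)

# Training set and a standard Hebbian (Hopfield) weight matrix.
train = np.stack([sample_example() for _ in range(100)])
W = train.T @ train / N
np.fill_diagonal(W, 0.0)

# Start from a noisy version of an unseen example, run synchronous
# updates, and measure the overlap with the clean example.
x = sample_example()
state = np.sign(x + rng.standard_normal(N))
for _ in range(20):
    state = np.sign(W @ state + 1e-9)
print(f"overlap with unseen example: {(state * x).mean():.2f}")
```

Under this sampling, a few frequent features recur across many training examples while most features are rare, which is the regime in which the abstract's claimed benign effects for generalization would arise.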
Submission Number: 47