Keywords: Hopfield networks, attention, associative memory, k nearest neighbors
TL;DR: We propose a novel k-Hopfield layer that retrieves the k nearest memories to a given input in a differentiable manner.
Abstract: Modern continuous Hopfield networks (MCHNs) are a variant of Hopfield networks that offer greater storage capacity and have been shown to be connected to the attention mechanism in transformers. In this paper, we propose a variant of MCHNs, which we call k-Hopfield layers, the first Hopfield-type network that retrieves the k nearest memories to a given input. k-Hopfield layers are differentiable and may serve as (i) a soft approach to k-nearest neighbors, (ii) an augmented form of memory in deep learning architectures, and (iii) an alternative to multihead attention in transformers. We empirically demonstrate that increasing k aids in correctly reconstructing a corrupted input. We also show that using a k-Hopfield layer as a replacement for multihead attention yields comparable performance in small vision transformers while requiring fewer parameters.
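For orientation, the following is a minimal NumPy sketch of the standard single-step MCHN retrieval that the abstract builds on: a softmax over scaled dot products between the query and the stored memories, followed by a weighted average of those memories (the same form as attention with keys and values tied). It is not the k-Hopfield layer itself, whose formulation is not given in this abstract. The function names `mchn_retrieve` and `hard_knn`, the inverse temperature `beta`, and the toy data are assumptions made purely for illustration; `hard_knn` is only a non-differentiable reference for what "k nearest memories" means under dot-product similarity.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D array.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def mchn_retrieve(memories, query, beta=8.0):
    """One standard MCHN update step (illustrative, not the k-Hopfield layer).

    memories: array of shape (N, d), one stored pattern per row.
    query:    array of shape (d,).
    Returns the retrieved pattern (d,) and the softmax weights (N,).
    """
    scores = beta * (memories @ query)   # scaled similarity to every memory
    weights = softmax(scores)            # differentiable, sums to 1
    return weights @ memories, weights   # convex combination of memories


def hard_knn(memories, query, k):
    # Non-differentiable reference: indices of the k most similar memories.
    scores = memories @ query
    return np.argsort(-scores)[:k]


# Toy data (hypothetical): 16 unit-norm memories in 32 dimensions.
rng = np.random.default_rng(0)
memories = rng.standard_normal((16, 32))
memories /= np.linalg.norm(memories, axis=1, keepdims=True)

# A lightly corrupted copy of memory 3 serves as the query.
query = memories[3] + 0.05 * rng.standard_normal(32)

retrieved, weights = mchn_retrieve(memories, query, beta=32.0)
nearest = hard_knn(memories, query, k=3)
```

With a large `beta` the softmax concentrates on the single closest memory, which is why the standard update behaves like a soft 1-nearest-neighbor lookup; retrieving k memories at once is the gap the abstract says the k-Hopfield layer addresses.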
Submission Number: 33