Hopfield Encoding Networks

21 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Hopfield Encoding Networks, Cross-Stimulus Content-Associative Memories, CLIP, encoder-decoder architectures
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We study the limitations of modern Hopfield networks and demonstrate that Hopfield Encoding Networks perform better at avoiding metastable states, while shifting away from the paradigm of giving content to get content.
Abstract: Content-associative memories such as Hopfield networks have been studied as a good mathematical model of the auto-associative features in the CA3 region of the hippocampal memory system. Modern Hopfield networks (MHN) are generalizations of the classical Hopfield networks with revised energy functions and update rules to expand storage to exponential capacity. However, they are not yet practical due to spurious metastable states, even when storing a small number of input patterns. Further, they have only been able to demonstrate recall of content by giving partial content in the same stimulus domain, and do not adequately explain how cross-stimulus associations can be accomplished, as is evidenced in the hippocampal formation. In this paper, we revisit Modern Hopfield networks from both these perspectives to offer new insights and extend the MHN model to mitigate these limitations. Specifically, we observe that the spurious states relate to the separability of the input patterns, which can be enhanced by encoding them before storage and decoding them after recall. We introduce a new kind of Modern Hopfield network called the Hopfield Encoding Network (HEN) to enable this, and show that such a model can support cross-stimulus associations, particularly between vision and language, to enable recall of memories with associatively encoded textual patterns.
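The abstract's starting point, the standard continuous MHN update rule, can be sketched as follows. This is a minimal illustration (not the authors' HEN code): patterns are stored as columns of a matrix, and retrieval iterates a softmax-weighted recombination of stored patterns. The function and variable names, the inverse-temperature value `beta`, and the random bipolar patterns are all illustrative assumptions; in a HEN, the stored columns would instead be encoded representations of the inputs.

```python
# Sketch of the standard continuous Modern Hopfield network retrieval step:
#   xi <- X @ softmax(beta * X.T @ xi)
# Names and parameters here are illustrative, not from the paper.
import numpy as np

def mhn_retrieve(X, query, beta=8.0, steps=5):
    """X: (d, n) matrix with n stored patterns as columns; query: (d,) cue."""
    xi = query.astype(float)
    for _ in range(steps):
        a = beta * (X.T @ xi)                   # similarity to each stored pattern
        p = np.exp(a - a.max()); p /= p.sum()   # softmax attention weights
        xi = X @ p                              # weighted recombination of patterns
    return xi

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(64, 4))       # 4 random bipolar patterns
cue = X[:, 0].copy()
cue[:16] = 0.0                                  # mask part of the cue
out = mhn_retrieve(X, cue)
overlap = (np.sign(out) @ X[:, 0]) / 64         # agreement with the target pattern
```

With well-separated (here, random) patterns the iteration converges to the stored target; the spurious metastable states discussed in the abstract arise when patterns are correlated, so that the softmax mixes several columns instead of committing to one, which is the failure mode the encode-before-store step is meant to mitigate.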
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4143