Abstract: Memory reconstruction is considered a generative process governed by the interplay between episodic memory and semantic information. However, few computational models have investigated this process in detail, and existing models have notable limitations. In this study, we develop and analyze a computational model that complements episodic memory with semantic information, and we examine how attention affects recall in this integrated system. We aim to enhance and extend a computational model proposed in recent research [2], which employs a Vector Quantized-Variational Autoencoder (VQ-VAE) as a model of the perceptual system and a PixelCNN architecture for semantic completion during memory reconstruction. While capable of generating plausible images and filling in missing parts of a memory trace, that model is limited in attentional selection by the rigid structure of the PixelCNN, and it is constrained in image resolution and complexity. In this work, we address these limitations and further investigate how attentional selection affects memory accuracy and generativity. First, we replace the PixelCNN with a Transformer (serving as semantic memory) that captures the underlying probability distribution over the latent representations of the VQ-VAE (episodic memory); unlike the PixelCNN, the Transformer permits flexible attentional selection. Second, we adopt a hierarchical VQ-VAE, which resembles the hierarchical organization of the visual cortex and enables the generation of more complex and realistic images. We also provide insights into the division of labor between the two levels of the hierarchical VQ-VAE. Finally, our simulations illustrate the effects of different levels and forms of attention on memory consolidation and reconstruction.
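To make the described architecture concrete, the following is a minimal PyTorch sketch of the kind of Transformer prior over discrete VQ-VAE latent codes outlined above. It is our illustration, not the paper's implementation: the class name, dimensions, and the additive-mask scheme for attentional selection are all assumptions.

```python
import torch
import torch.nn as nn

class LatentTransformerPrior(nn.Module):
    """Autoregressive Transformer over a flattened grid of VQ-VAE code indices."""

    def __init__(self, codebook_size=512, seq_len=64, d_model=256, n_head=8, n_layers=4):
        super().__init__()
        self.tok_emb = nn.Embedding(codebook_size, d_model)
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_head, 4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, codebook_size)

    def forward(self, codes, attn_mask=None):
        # codes: (B, T) integer indices from the VQ-VAE encoder (the episodic trace).
        T = codes.size(1)
        x = self.tok_emb(codes) + self.pos_emb[:, :T]
        # A causal mask enforces autoregressive completion, as in the PixelCNN it
        # replaces; an extra additive mask can suppress unattended positions
        # (attentional selection).
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        mask = causal if attn_mask is None else causal + attn_mask
        h = self.blocks(x, mask=mask)
        return self.head(h)  # (B, T, codebook_size): logits over each next code

# Recall with attention withdrawn from part of the latent grid (illustrative only):
prior = LatentTransformerPrior()
codes = torch.randint(0, 512, (1, 64))   # flattened 8x8 grid of code indices
attn = torch.zeros(64, 64)
attn[:, 48:] = float("-inf")             # ignore an "unattended" image region
logits = prior(codes, attn_mask=attn)    # semantic-completion logits, shape (1, 64, 512)
```

Compared with a PixelCNN, whose receptive field is fixed by its masked convolutions, the attention mask here can be varied per query at inference time, which is the flexibility the abstract attributes to the Transformer.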