Abstract: Visual memory schemas (VMS) capture the regions of a scene image that cause the scene to be remembered, providing a two-dimensional memorability map that indicates which parts of the scene match mental schemas held in the mind. Despite the advantage of identifying which parts of an image lead to it being remembered, VMS prediction capabilities lag behind those of single-score memorability prediction. Compared with predicting a single-score rating for the likelihood that a person remembers an image, VMS prediction is a significantly harder task, owing to increased computational complexity, minimal model development compared with single-score prediction, and a lack of relevant data. In this work, we aim to improve methods for two-dimensional memorability prediction. We first significantly increase the size of a database containing VMS maps obtained from participants in a scene memorization experiment, and then we develop an architecture that leverages existing single-score image memorability datasets to predict VMS maps. Our final model, dual-feedback VMS (DF-VMS), significantly outperforms existing VMS prediction models, with a performance increase of 11.8%. Additionally, we explore the semantic structures that visual memory schemas actually capture, determining the combinations of scene elements that lead to a scene being remembered.
External IDs: doi:10.1109/tcds.2025.3533112