The Cue or not the Cue? A Mechanistic Study of Memory Mechanisms in RNNs

Published: 23 Sept 2025, Last Modified: 27 Nov 2025
Venue: NeurReps 2025 Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Recurrent Neural Networks, Computational Neuroscience, Memory mechanisms
TL;DR: A simple interpretability approach reveals that RNNs trained on delayed cue discrimination rely on short-term memory, preserving raw inputs rather than engaging in working-memory–like transformations.
Abstract: Neural networks can solve behavioral tasks requiring memory either by retaining the full content of an input or by actively manipulating it into a simplified version. Yet how to distinguish between these two memory retention mechanisms in recurrent neural networks (RNNs) remains underexplored. To bridge this gap, we studied RNNs performing delayed cue discrimination (DCD) tasks and asked whether they retain the raw continuous-valued input cues or only their task-relevant binary representations. Using linear probes trained on neural activity during the delay period, we tested whether RNNs eventually collapse the retained cue values into compact, binary representations. Even though the RNNs were trained only on binary cues, we consistently observed high-fidelity reconstruction of continuous cue inputs across diverse experimental conditions and learned memory mechanisms. Overall, our results provide evidence that RNNs can find solutions that preserve the contents of past memories with high fidelity, favoring representational completeness over efficiency, even when the task does not demand it.
Submission Number: 119
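
The linear-probe test described in the abstract can be illustrated with a small sketch. This is not the authors' code: the hidden states below are synthetic stand-ins for delay-period RNN activity, and every name (`fit_linear_probe`, `probe_r2`, `embed`, the noise level, the ridge penalty `alpha`) is a hypothetical choice made for illustration. The sketch contrasts a population that keeps a graded copy of the cue with one that has collapsed the cue to its sign, and asks how well a linear readout recovers the continuous cue value in each case.

```python
import numpy as np

def _with_bias(H):
    # Append a constant column so the probe can learn an offset.
    return np.hstack([H, np.ones((H.shape[0], 1))])

def fit_linear_probe(H, y, alpha=1e-3):
    # Closed-form ridge regression from states H (trials x units) to cue y.
    # The small ridge penalty is an assumption, not taken from the paper.
    Hb = _with_bias(H)
    penalty = alpha * np.eye(Hb.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unpenalized
    return np.linalg.solve(Hb.T @ Hb + penalty, Hb.T @ y)

def probe_r2(w, H, y):
    # Coefficient of determination of the probe's predictions on (H, y).
    residual = y - _with_bias(H) @ w
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
n_trials, n_units, n_train = 600, 32, 400
cue = rng.uniform(-1.0, 1.0, size=n_trials)  # continuous cue value per trial
embed = rng.normal(size=n_units)             # hypothetical cue-to-unit weights

def noise():
    return 0.05 * rng.normal(size=(n_trials, n_units))

# Synthetic "delay-period" states under two hypothetical memory codes.
states = {
    "graded": np.tanh(0.5 * np.outer(cue, embed)) + noise(),
    "binary": np.tanh(0.5 * np.outer(np.sign(cue), embed)) + noise(),
}

held_out = np.arange(n_trials) >= n_train
same_sign = held_out & (cue > 0)  # held-out trials that share one binary label

scores = {}
for name, H in states.items():
    w = fit_linear_probe(H[:n_train], cue[:n_train])
    scores[name] = (
        probe_r2(w, H[held_out], cue[held_out]),
        probe_r2(w, H[same_sign], cue[same_sign]),
    )
    print(f"{name}: overall R^2 = {scores[name][0]:.3f}, "
          f"within-class R^2 = {scores[name][1]:.3f}")
```

One design point worth noting: a purely binary code does not give an overall R^2 near zero, because the sign of the cue already predicts much of its value (for a cue uniform on [-1, 1], about three quarters of the variance in the noise-free case). Scoring the probe only on trials that share the same binary label removes that contribution, so the within-class score is the more direct indicator of whether graded cue information survives the delay.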