Combining long and short spatiotemporal reasoning for deep reinforcement learning

Published: 01 Jan 2025, Last Modified: 15 May 2025, Venue: Neurocomputing 2025, License: CC BY-SA 4.0
Abstract: Improving sample efficiency is a crucial challenge in deep reinforcement learning for sequential decision-making, and visual reinforcement learning urgently needs a method that learns rich representations from sequential samples. Previous research shows that computational complexity or over-parameterization prevents agents from learning long-term spatiotemporal properties, which substantially reduces how efficiently samples are used. To address these challenges, we provide an online reinforcement learning framework that integrates both short- and long-term dependencies to describe spatiotemporal properties effectively. We propose a method, Spatio-Temporal Reasoning and Memory (STRM), that aims to reconcile these two kinds of dependency. Specifically, the short-term spatiotemporal feature module extracts local spatiotemporal features and their relationships using a 3D convolutional neural network combined with a self-attention mechanism. The long-term spatiotemporal feature module, in turn, employs a Transformer-based external memory network that takes the short-term spatiotemporal features as input. Finally, the short- and long-term features are combined into a comprehensive state representation. Empirical evaluation shows that the framework performs markedly better than previous approaches.
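The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation, and every name in it (`attend`, `short_term_features`, `ExternalMemory`, `state_representation`) is hypothetical. Several simplifications are assumed: frames are already flattened into small feature vectors, a sliding-window average stands in for the 3D convolutional network, attention uses no learned projections, the memory is a fixed-size buffer read by attention rather than a full Transformer, and the final combination is plain concatenation (the abstract only says the features are combined).

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    # Scaled dot-product attention with no learned projections (assumption).
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def short_term_features(frames, window=3):
    # Stand-in for the 3D CNN: average each sliding window of frames,
    # then relate the resulting local features with self-attention.
    dim = len(frames[0])
    local = []
    for t in range(len(frames) - window + 1):
        clip = frames[t:t + window]
        local.append([sum(f[d] for f in clip) / window for d in range(dim)])
    return attend(local, local, local)


def mean_pool(tokens):
    # Collapse a list of feature vectors into one summary vector.
    dim = len(tokens[0])
    return [sum(tok[d] for tok in tokens) / len(tokens) for d in range(dim)]


class ExternalMemory:
    # Hypothetical fixed-capacity memory of past short-term summaries.
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = []

    def write(self, summary):
        # Append the newest summary and drop the oldest beyond capacity.
        self.slots.append(summary)
        self.slots = self.slots[-self.capacity:]

    def read(self, query):
        # Attention read: the current summary queries all stored slots.
        return attend([query], self.slots, self.slots)[0]


def state_representation(frames, memory):
    # Short-term summary, memory update, long-term read, then concatenation.
    short = mean_pool(short_term_features(frames))
    memory.write(short)
    long_term = memory.read(short)
    return short + long_term
```

Calling `state_representation` once per environment step with the most recent stack of frames and a persistent `ExternalMemory` instance returns a vector twice the width of a single frame feature: the first half is the short-term summary, the second half the memory read.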