Spiking Decision Making Bottleneck for Offline Reinforcement Learning With Spiking Neural Networks

ICLR 2026 Conference Submission17980 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0

Keywords: Spiking Neural Networks, Spike-based Training, Brain-inspired Computing

Abstract: Spiking Neural Networks (SNNs), with their event-driven, low-power characteristics, provide key technological support for energy-constrained embodied intelligence applications, particularly offline Reinforcement Learning (RL) tasks. However, offline RL relies solely on precollected data for policy training and cannot interact with the environment in real time, so it is hampered by the inherent redundancy of offline data. This redundancy limits the model's ability to learn compact and generalizable representations, leading to degraded policy performance and reduced robustness. To address this issue, we propose the Spiking Decision Making Bottleneck (SDMB), a novel information compression framework designed for SNN-based offline RL. The framework guides the network toward abstract, decision-relevant trajectory representations for efficient policy learning. Specifically, it minimizes the mutual information between the input and the latent representations, thereby suppressing input redundancy and promoting sparse, decision-relevant activations. To prevent over-compression and the consequent loss of critical behavioral information, SDMB further incorporates the principle of maximum entropy so that sufficient informational diversity is preserved during policy optimization. Experimental results on D4RL benchmark tasks validate the effectiveness of SDMB in extracting key spiking features in offline RL settings. Compared with both SNN-based and Artificial Neural Network (ANN)-based methods, SDMB surpasses the state of the art while consuming less energy, demonstrating dual advantages in energy efficiency and policy generalization.
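The abstract does not give the training objective, so the following is only an illustrative sketch of how a compression term and a maximum-entropy term could be combined in one loss. It is not the authors' formulation. It assumes (hypothetically) that each latent neuron's spiking is summarized by a Bernoulli firing probability, that the mutual information is penalized through the standard variational bound (KL divergence to a fixed sparse prior), and that the policy is a diagonal Gaussian. All function names, the prior rate, and the weights `beta` and `alpha` are made up for illustration.

```python
import math

def bernoulli_kl(p, q, eps=1e-7):
    """KL(Bernoulli(p) || Bernoulli(q)) in nats, with clamping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def compression_penalty(firing_probs, prior_rate=0.1):
    """Per-input KL to a sparse factorized prior (assumed, not from the paper).

    Averaged over inputs, this quantity upper-bounds I(input; latent).
    """
    return sum(bernoulli_kl(p, prior_rate) for p in firing_probs)

def gaussian_entropy(log_stds):
    """Differential entropy of a diagonal Gaussian policy, in nats."""
    const = 0.5 * math.log(2.0 * math.pi * math.e)
    return sum(const + s for s in log_stds)

def bottleneck_style_loss(policy_loss, firing_probs, log_stds,
                          beta=0.01, alpha=0.001):
    """Hypothetical combined loss: fit the policy, compress, keep entropy."""
    return (policy_loss
            + beta * compression_penalty(firing_probs)
            - alpha * gaussian_entropy(log_stds))

# Example: dense latent activity is penalized more than sparse activity.
sparse = bottleneck_style_loss(1.0, [0.1, 0.1, 0.1], [0.0, 0.0])
dense = bottleneck_style_loss(1.0, [0.9, 0.9, 0.9], [0.0, 0.0])
```

In this sketch, `beta` trades compression against policy fit, while `alpha` counteracts over-compression by rewarding a higher-entropy policy, which mirrors the two roles the abstract describes.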

Primary Area: applications to neuroscience & cognitive science

Submission Number: 17980
