CASRL: Collision Avoidance with Spiking Reinforcement Learning Among Dynamic, Decision-Making Agents

Chengjun Zhang, Ka-Wa Yip, Bo Yang, Zhiyong Zhang, Mengwen Yuan, Rui Yan, Huajin Tang

Published: 2024, Last Modified: 15 May 2025, IROS 2024, CC BY-SA 4.0

Abstract: Developing an efficient collision avoidance policy with spiking reinforcement learning for dynamic, decision-making agents remains challenging. Moreover, energy-efficient collision avoidance is important for mobile robots that operate with limited on-board computing resources. Most existing energy-efficient methods based on spiking reinforcement learning are predominantly concerned with the navigational capabilities of a single agent and cannot handle a large and possibly varying number of agents. To overcome these limitations, we propose a model called collision avoidance with spiking reinforcement learning (CASRL), based on proximal policy optimization algorithms. The proposed model consists of an actor built from spiking neural networks (SNNs) and a critic built from deep neural networks (DNNs). Our spiking reinforcement learning algorithm can handle an arbitrary number of other agents by virtue of a spiking-gated transformer (SpikeGTr) architecture and an accumulate-to-fire (ATF) module. Extensive experimental results demonstrate that CASRL obtains a competitive navigation success rate and exhibits higher time-efficiency in crowded scenarios than traditional DNN-based methods.
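
The abstract states that CASRL is based on proximal policy optimization. For readers unfamiliar with that family, the following is a minimal sketch of the standard clipped surrogate loss, not the authors' implementation. The function name `ppo_clip_loss`, its parameter names, and the default `clip_eps=0.2` are assumptions chosen for illustration. The abstract does not say which PPO variant or hyperparameters CASRL uses.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate, averaged over samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs; in an
    actor-critic setup they would come from the actor).
    advantages: advantage estimates (in an actor-critic setup these would
    be derived from the critic's value predictions).
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(nl - ol)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the surrogate.
    return -total / len(advantages)
```

When the current and old policies assign the same probability to each action, the ratio is 1 and the loss reduces to the negative mean advantage. Clipping only takes effect once the policy has moved away from the one that collected the data.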