NeuroVIO: SNN Framework for Visual-Inertial Pose Estimation

Published: 18 Sept 2025, Last Modified: 18 Oct 2025, Venue: EdgeAI4R Poster, License: CC BY 4.0
Keywords: Visual-Inertial Odometry, Spiking Neural Networks, Multimodal Sensor Fusion, Adaptive Leaky-Integrate-and-Fire Neurons, Feature Sparsity, Energy Efficiency
TL;DR: NeuroVIO is a hybrid CNN-SNN-based end-to-end architecture for underwater robots that fuses visual and inertial signals to estimate pose, reducing energy consumption by 80.4% without accuracy loss, enabling efficient, autonomous marine exploration.
Abstract: We present NeuroVIO, a hybrid end-to-end architecture that integrates conventional and spiking neural networks for multimodal visual-inertial odometry in underwater mobile robots. NeuroVIO addresses the need for energy-efficient and accurate pose estimation on such platforms. In our approach, a CNN backbone extracts visual features from successive frames and converts them into time-encoded sequences, which are processed by adaptive leaky-integrate-and-fire neurons with learnable thresholds. Concurrently, inertial measurements are encoded by an SNN feature extractor. The fused features pass through a spike LSTM to capture temporal dependencies, and a spiking regression head predicts the six-dimensional pose vector. Evaluated on the AQUALOC dataset, NeuroVIO reduces energy consumption by 80.4% relative to its non-spiking counterpart while preserving pose estimation accuracy. The experimental results demonstrate that integrating neuromorphic paradigms into resource-limited marine robotics platforms can enhance the autonomy of underwater robots in exploration tasks.
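For readers unfamiliar with the leaky-integrate-and-fire neurons mentioned in the abstract, the following is a minimal illustrative sketch of a single such neuron, not the authors' implementation. The function name `lif_forward`, the `decay` factor, the soft reset, and the default values are all assumptions made for illustration. The threshold is exposed as a plain parameter here; in a trained network with learnable thresholds it would typically be updated by gradient descent, which this sketch does not show.

```python
from typing import List, Tuple


def lif_forward(
    inputs: List[float],
    threshold: float = 1.0,  # hypothetical default; learnable in an adaptive neuron
    decay: float = 0.9,      # hypothetical membrane leak factor per time step
) -> Tuple[List[int], List[float]]:
    """Simulate one leaky integrate-and-fire neuron over a sequence of input currents.

    Returns the binary spike train and the membrane potential after each step.
    """
    v = 0.0
    spikes: List[int] = []
    potentials: List[float] = []
    for x in inputs:
        v = decay * v + x                 # leak the old potential, integrate the input
        s = 1 if v >= threshold else 0    # fire when the potential reaches the threshold
        if s:
            v -= threshold                # soft reset (an assumed design choice)
        spikes.append(s)
        potentials.append(v)
    return spikes, potentials


if __name__ == "__main__":
    current = [0.5] * 6
    high, _ = lif_forward(current, threshold=1.0)
    low, _ = lif_forward(current, threshold=0.6)
    print("threshold=1.0 spikes:", high)
    print("threshold=0.6 spikes:", low)
```

Lowering the threshold makes the neuron fire more often for the same input, which illustrates why the threshold is a natural quantity to adapt: it trades spike sparsity (and hence, on neuromorphic hardware, energy) against how much of the input signal is transmitted.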
Submission Type: Novel research
Student Paper: No
Demo Or Video: Yes
Public Extended Abstract: Yes
Submission Number: 4