SAFE: Spiking Neural Network-based Audio Fidelity Evaluation

Published: 01 Sept 2025, Last Modified: 18 Nov 2025, Venue: ACML 2025 Conference Track, Readers: Everyone, License: CC BY 4.0
Abstract: Recent advances in generative AI have enabled the creation of highly realistic synthetic audio, which poses significant challenges in voice authentication, media verification, and fraud detection. While Artificial Neural Networks (ANNs) are frequently used for fake audio detection, they often struggle to generalize to unseen and complex manipulations, particularly partial fake audio, where real and synthetic segments are seamlessly combined. This paper explores the use of Spiking Neural Networks (SNNs) for fake and partial fake audio detection, an unexplored area. Taking advantage of the inherent energy efficiency and temporal processing capabilities of SNNs, we propose novel SNN-based architectures for both tasks. We perform comprehensive evaluations that include hyperparameter tuning, cross-dataset generalization, noise robustness, and partial fake audio detection using multiple large-scale public audio datasets. Our results show that SNNs achieve performance comparable to state-of-the-art ANN models while showing better generalization capabilities and robustness to noise. These SNN-based approaches also offer additional advantages, such as reduced model sizes and the ability to classify individual segments, making them more suitable for resource-constrained and real-time voice authentication applications. This work lays a foundation for exploring SNNs as countermeasures against audio spoofing in security-critical applications.
Submission Number: 231