LiveProbe: Exploring Continuous Voice Liveness Detection via Phonemic Energy Response Patterns

Published: 01 Jan 2023, Last Modified: 13 May 2025. Venue: IEEE Internet Things J. 2023. License: CC BY-SA 4.0
Abstract: Voice assistants support contactless control of smart devices and are thus regarded as a holy grail of human–computer interaction. However, recent studies reveal that an adversary can manipulate devices with malicious voice commands. This security risk arises because liveness detection is executed only once and no safeguard modules operate after service activation. Therefore, identifying the speaker type (i.e., human articulators or loudspeakers) is critical to protecting voice-driven services throughout an entire interaction session. In this article, we propose LiveProbe, a continuous voice liveness detection approach that leverages the unique energy response patterns in frequency bands induced by distinct voice generation mechanisms. The rationale behind LiveProbe has two aspects: human articulators reshape the initial voice through exquisitely coordinated movements of the vocal organs, which act as band-pass filters generating unique energy responses; in contrast, the internal modules of loudspeakers are fixed in position and cannot reproduce this response characteristic. To that end, we first study the voice generation mechanisms of the two speaker types that cause the spectrum differences. Then, we construct signal processing and deep-learning modules to extract liveness features. Notably, our approach does not interfere with normal voice interaction and requires no customized sensors. Experiments demonstrate its effectiveness against potential attacks, with a false acceptance rate of 0.51%.
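To make the notion of an energy response pattern across frequency bands concrete, the following is a minimal sketch that computes the fraction of a frame's spectral energy falling in each band. It is an illustration only and not LiveProbe's actual pipeline: the function name `band_energy_ratios`, the band edges, the sample rate, and the synthetic test tone are all assumptions introduced here, and the abstract does not specify how the bands or features are defined.

```python
import cmath
import math


def band_energy_ratios(frame, sample_rate, band_edges_hz):
    """Return the fraction of spectral energy of `frame` in each band.

    Hypothetical helper for illustration; band edges are assumed, not
    taken from the paper.
    """
    n = len(frame)
    # Naive DFT over the non-negative frequency bins (adequate for short frames).
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        power.append(abs(coeff) ** 2)

    bin_hz = sample_rate / n  # frequency spacing between adjacent bins
    energies = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        # Sum the power of every bin whose center frequency lies in [lo, hi).
        energies.append(sum(p for k, p in enumerate(power) if lo <= k * bin_hz < hi))

    total = sum(energies) or 1.0  # avoid division by zero on silent frames
    return [e / total for e in energies]


# Usage: a synthetic 1 kHz tone sampled at 8 kHz (assumed values).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
ratios = band_energy_ratios(frame, sample_rate, [0, 500, 1500, 4000])
```

For the synthetic tone, nearly all of the energy lands in the middle band. A liveness detector in the spirit of the abstract would compare such per-band distributions between human speech and loudspeaker playback, though the actual features are extracted by the signal processing and deep-learning modules the authors describe.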