Abstract: The open nature of voice input makes voice assistant (VA) systems vulnerable to various acoustic attacks (e.g., replay and voice synthesis attacks). A simple yet effective way for adversaries to launch these attacks is to hide behind a barrier (e.g., a wall, a window, or a door) and issue unauthorized voice commands without being observed by legitimate users. In this work, we develop an automated, training-free defense system that protects VA systems from such thru-barrier acoustic attacks. Our study finds that acoustic signals passing through barriers generally exhibit a unique frequency-selective effect in the vibration domain. We therefore propose a system that captures this effect by leveraging the low-cost, cross-domain sensing available in users’ wearables. The system replays the audio-domain signals through the wearable’s speaker and captures the resulting conductive vibrations with the built-in accelerometer. To improve the system’s reliability, we develop a vibration-domain enhancement method that extracts the phonemes most sensitive to the frequency-selective effect of barriers, and we identify vibration-domain features that effectively capture the barriers’ effects. A 2D-correlation-based method then examines the speech similarity between the recordings from the VA system and the user’s wearable to detect thru-barrier attacks. Extensive experiments with various barriers and environments demonstrate that the proposed defense system can effectively defend against random, replay, synthesis, and hidden voice attacks with equal error rates below 4%.
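As a rough illustration of the 2D-correlation step mentioned above, the following minimal sketch compares two time-frequency feature matrices with a 2D Pearson correlation coefficient and flags a low score as a possible thru-barrier attack. It is not the authors' implementation: the function names `corr2` and `is_thru_barrier`, the 0.8 threshold, the matrix shapes, and the synthetic linear attenuation used to mimic a barrier are all assumptions made for this example.

```python
import numpy as np


def corr2(a, b):
    """2D Pearson correlation coefficient between two equal-shape matrices."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("inputs must have the same shape")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def is_thru_barrier(va_features, wearable_features, threshold=0.8):
    """Flag an attack when the two feature maps are insufficiently similar.

    The threshold is a hypothetical value chosen for this sketch only.
    """
    return corr2(va_features, wearable_features) < threshold


# Synthetic stand-ins for vibration-domain features (frequency bins x frames).
rng = np.random.default_rng(0)
wearable = rng.random((32, 50))

# Direct speech: the VA hears nearly the same content, plus a little noise.
direct = wearable + rng.normal(0.0, 0.05, wearable.shape)

# Barrier (assumed model): higher frequency bins are attenuated more strongly.
attenuation = np.linspace(1.0, 0.05, wearable.shape[0])[:, None]
barrier = wearable * attenuation

print("direct  similarity:", round(corr2(direct, wearable), 3))
print("barrier similarity:", round(corr2(barrier, wearable), 3))
```

In this toy setup the frequency-selective attenuation lowers the similarity score relative to the direct case, which is the kind of gap a threshold-based detector would rely on. A real system would need features and a threshold chosen from measured data.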