Abstract: Speech carries a wealth of sensitive information, and many studies have investigated methods of speech eavesdropping. Recent research has shown that even in soundproof indoor environments, speech systems can still be compromised by outdoor RF sensing technologies. However, existing approaches rely heavily on prior knowledge of the target environment, fail when the primary sound source is occluded, or are limited to word-level classification. Consequently, they have not fully exposed the severe threats that RF sensing poses to speech privacy. To bridge these gaps, this paper presents Argus-ear, a mmWave-based speech eavesdropping system. By localizing and identifying sound sources throughout the target room, Argus-ear captures subtle vibrations from the most eavesdropping-worthy reflectors and reconstructs speech using deep neural networks. A series of techniques is proposed to enhance weak vibration sensing, suppress noise interference, and improve the fidelity of speech reconstruction. Extensive experiments demonstrate that Argus-ear can identify various types of sound sources and accurately reconstruct unconstrained vocabulary-level speech across different languages, speakers, and sound sources.