Abstract: Optical image stabilization (OIS), which uses a micro-electromechanical structure in the camera lens to compensate for the optical distortion caused by camera shake, has become an indispensable feature in many smartphones. However, we find that this seemingly benign component can be exploited to eavesdrop on nearby audio, posing a significant threat to privacy during conversations and phone calls. Specifically, external acoustic stimuli cause the OIS component to vibrate slightly, and as the coil and magnetized components inside the OIS vibrate, they produce electromagnetic leakage in accordance with Faraday's law of electromagnetic induction. This electromagnetic leakage carries voice information, so an attacker who intercepts it can recover the audio signal. Building on this discovery, we propose OISMic, a new acoustic eavesdropping attack that exploits sound-induced OIS vibrations on smartphones. Unlike existing acoustic eavesdropping attacks, OISMic is not bound by the system-permission constraints that limit many sensor-based approaches, and it is immune to the ultrasonic jammers that hinder methods relying on microwave or light reflections to sense sound-induced vibrations. To carry out this non-trivial attack in practical scenarios, we develop a compact prototype circuit that captures the electromagnetic leakage caused by OIS vibrations. After the captured electromagnetic signals are converted into audio signals, a software-based phase-locked loop (PLL) enhances the representation of the voice components. To reconstruct the weak audio signals, we also design a diffusion-based neural network that learns the distribution of electromagnetic noise within the audio spectrum. Extensive experiments indicate that OISMic can accurately reconstruct voice under various scenarios, achieving an average word correct rate of 90.57% across different devices.
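The abstract does not describe how the software PLL is built. As an illustration of the general technique only, the sketch below shows a generic textbook second-order PLL that locks onto a near-sinusoidal input and reports its instantaneous frequency. The function name `software_pll`, the loop gains `kp` and `ki`, and the test tone are assumptions made for this example, and the input is assumed to have roughly unit amplitude; this is not the authors' implementation.

```python
import math


def software_pll(samples, fs, f0, kp=0.02, ki=0.0002):
    """Track a near-f0 sinusoid and return a per-sample frequency estimate in Hz.

    Hypothetical helper for illustration: a multiplier phase detector followed by
    a proportional-integral loop filter driving a numerically controlled
    oscillator (NCO). Gains assume an input amplitude close to 1.
    """
    w0 = 2.0 * math.pi * f0 / fs   # nominal NCO phase step (rad/sample)
    phase = 0.0                    # current NCO phase (rad)
    integ = 0.0                    # integrator state = tracked frequency offset (rad/sample)
    freq_est = []
    for x in samples:
        # Phase detector: the low-frequency part of this product is
        # proportional to sin(input_phase - nco_phase).
        err = -x * math.sin(phase)
        # PI loop filter: the integrator absorbs any constant frequency offset.
        integ += ki * err
        # Advance the NCO by the nominal step plus the loop correction.
        phase = (phase + w0 + kp * err + integ) % (2.0 * math.pi)
        freq_est.append((w0 + integ) * fs / (2.0 * math.pi))
    return freq_est


if __name__ == "__main__":
    fs = 48_000
    # Synthetic 1010 Hz tone; the loop starts from a 1000 Hz guess.
    tone = [math.cos(2.0 * math.pi * 1010.0 * n / fs) for n in range(fs // 2)]
    est = software_pll(tone, fs, f0=1000.0)
    tail = est[-4800:]
    print(f"mean tracked frequency: {sum(tail) / len(tail):.2f} Hz")
```

The per-sample estimate carries a small ripple at twice the carrier frequency because the multiplier detector is not followed by a separate low-pass filter, so averaging over a short window gives a steadier reading.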