Synergistic Multi-Modal Keystroke Eavesdropping in Virtual Reality With Vision and Wi-Fi

Jiachun Li, Yan Meng, Fazhong Liu, Tian Dong, Suguo Du, Guoxing Chen, Yuling Chen, Haojin Zhu

Published: 01 Jan 2025, Last Modified: 08 Nov 2025
Venue: IEEE Transactions on Information Forensics and Security
License: CC BY-SA 4.0
Abstract: In panoramic and immersive virtual reality (VR) scenarios, users type on a floating, invisible keyboard that external adversaries cannot observe, creating the illusion that their input is confidential. While recent studies have demonstrated the feasibility of leveraging side-channel information (e.g., vision, Wi-Fi) to eavesdrop on keystrokes in VR, they assume that users type with fixed gestures, as they would on traditional physical keyboards. However, in real-world scenarios, VR creates a 3D immersive environment that allows users to type from varying orientations. This variation significantly degrades the quality of side-channel information (e.g., occlusion in vision, instability in Wi-Fi channels), leading to ineffective inference. In this study, we propose a multi-modal keystroke eavesdropping attack called WiViLeak, which combines Wi-Fi and vision information so that the two modalities complement each other. To address low-quality side-channel data caused by users’ varying orientations, we develop a theoretical model to explore the relationship between users’ hand movements in physical space (from the vision modality) and fluctuating Wi-Fi signals (from the wireless modality) as users change orientation. Based on this model, we design a fully transformer-based orientation calibration module to recover users’ vision data, aligning it as if they were facing the camera (i.e., in a front-facing view). Meanwhile, WiViLeak reconstructs Wi-Fi data to correspond to the front-facing view, utilizing the orientation angle derived from vision data. Finally, WiViLeak extracts effective features from the reconstructed, high-quality vision and Wi-Fi data to predict keystrokes. We implement a WiViLeak prototype, achieving 89.2% accuracy in eavesdropping keystrokes and 93.6% top-100 password theft accuracy, while also demonstrating robustness across various real-world VR scenarios, including payments, chatting, and meetings.
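The notion of aligning data to a front-facing view can be illustrated geometrically. The following is a minimal sketch only, assuming that hand keypoints are available as 3D coordinates and that the user's orientation reduces to a single yaw angle about the vertical axis. It applies a plain rigid rotation and is not the transformer-based calibration module described in the abstract; the function name `rotate_to_front_view` and the coordinate convention (y is up) are hypothetical.

```python
import math


def rotate_to_front_view(keypoints, yaw_deg):
    """Undo a yaw rotation about the vertical (y) axis.

    keypoints: iterable of (x, y, z) tuples (assumed convention: y is up).
    yaw_deg:   estimated user orientation in degrees (assumed to be known,
               e.g. derived from the vision data).
    Returns a list of (x, y, z) tuples expressed in the front-facing frame.
    """
    # Rotating by -yaw cancels the user's rotation by +yaw.
    theta = math.radians(-yaw_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y, z in keypoints:
        # Standard rotation about the y axis; the height (y) is unchanged.
        rotated.append((c * x + s * z, y, -s * x + c * z))
    return rotated


# Hypothetical fingertip position observed while the user is turned by 90 degrees.
observed = [(0.0, 0.5, -1.0)]
front_view = rotate_to_front_view(observed, 90.0)
```

A rigid rotation like this cannot recover keypoints that are occluded in the camera view, which is one reason a learned calibration module may be preferable in practice.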