Abstract: Speaker diarization refers to identifying which speaker produced each utterance in a conversation. It is critical in sensitive settings such as psychological counseling and legal consultations. However, traditional approaches based on microphones or video cameras raise privacy concerns, and their conspicuous deployment makes participants uncomfortable. To address this, we propose a non-intrusive speaker diarization system based on mmWave sensing. Our approach leverages the spatial diversity of signals reflected from multiple objects to distinguish speakers. Specifically, it isolates the signals of objects vibrating in response to speech and extracts speaker-related features through a two-stage feature extraction process. Our system achieves over 93% accuracy in real-world scenarios, demonstrating that it can reliably distinguish speakers.