Speaker Verification based on Deep Neural Network for Text-Constrained Short Commands

Heesu Kim, Euntae Choi, Kiyoung Choi

Published: 2018, Last Modified: 15 May 2025, APSIPA 2018, CC BY-SA 4.0

Abstract: Speaker verification is known to be a difficult task, especially for short utterances. Based on the observation that actual voice commands are composed of a few repeated words, we propose an effective approach for building and training a deep neural network to extract features with properties suited to this condition. We demonstrate the effectiveness of each property through a separately designed experiment. Our proposed approach achieves a 5.89% equal error rate on word-scale commands shorter than 1 second; with linear discriminant analysis, the equal error rate decreases to 3.43%.
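For readers unfamiliar with the metric quoted above, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER could be computed from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate` and the toy scores are assumptions made purely for illustration.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Approximate the EER from two score arrays (hypothetical helper).

    genuine:  scores for trials where the claimed speaker is correct
    impostor: scores for trials where the claimed speaker is wrong
    Higher scores are assumed to mean "more likely the same speaker".
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)

    # Use every observed score as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))

    # False rejection rate: genuine trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: impostor trials scoring at or above it.
    far = np.array([(impostor >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0


# Toy scores, invented for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%}")
```

With finite trial lists the two error curves may not cross exactly, so this sketch averages them at the closest threshold; evaluation toolkits often interpolate instead.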