Speaker identification with deep neural networks

Jul 20, 2019 | RIIAA 2019 Conference Submission | Readers: Everyone
  • Keywords: speaker identification, neural networks, deep neural networks, convolutional neural networks, text independent
  • TL;DR: Text-independent speaker identification with deep convolutional neural networks, achieving 93% accuracy
  • Abstract: Considerable research effort goes into medical solutions for Alzheimer’s disease, but significantly less into solutions for caregiving after diagnosis. This project is motivated by the goal of giving patients a tool for recognizing familiar people in common situations. The solution is a text-independent speaker recognition system based on deep learning. This paper compares the performance of a convolutional neural network (CNN) against a fully connected neural network on the speaker identification problem. The CNN includes one-dimensional convolutional layers, max pooling layers, batch normalization, regularization techniques, and a softmax output. The models are trained and tested on the freely available VCTK Corpus data set with 109 speakers. Our results show that the CNN outperforms the fully connected network, with an accuracy of 93.05% compared to 75.88%.
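
To make the layer types named in the abstract concrete, the following is a minimal sketch of a forward pass through a one-dimensional convolution, max pooling, and a softmax output over 109 speakers. It is not the authors' model: the layer sizes, the random weights, the helper names (`conv1d`, `max_pool1d`, `predict_speaker`, `random_params`), and the synthetic input signal are all assumptions chosen for illustration, and batch normalization and regularization are omitted for brevity.

```python
import math
import random

NUM_SPEAKERS = 109  # from the abstract; every other size below is an assumption


def conv1d(signal, kernels, biases):
    """Valid 1-D cross-correlation: one input channel, one output channel per kernel."""
    k = len(kernels[0])
    return [
        [sum(w * signal[i + j] for j, w in enumerate(kernel)) + b
         for i in range(len(signal) - k + 1)]
        for kernel, b in zip(kernels, biases)
    ]


def relu(channel):
    return [max(0.0, x) for x in channel]


def max_pool1d(channel, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(channel[i:i + size])
            for i in range(0, len(channel) - size + 1, size)]


def dense(features, weights, biases):
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_speaker(signal, params):
    """Conv -> ReLU -> max pool -> flatten -> dense -> softmax."""
    channels = conv1d(signal, params["kernels"], params["conv_bias"])
    pooled = [max_pool1d(relu(c), params["pool"]) for c in channels]
    features = [x for c in pooled for x in c]
    probs = softmax(dense(features, params["dense_w"], params["dense_b"]))
    return max(range(len(probs)), key=probs.__getitem__), probs


def random_params(signal_len, n_kernels=4, kernel_size=5, pool=4, seed=0):
    """Untrained, randomly initialized weights (hypothetical sizes)."""
    rng = random.Random(seed)
    n_features = n_kernels * ((signal_len - kernel_size + 1) // pool)
    return {
        "kernels": [[rng.gauss(0, 0.1) for _ in range(kernel_size)]
                    for _ in range(n_kernels)],
        "conv_bias": [0.0] * n_kernels,
        "pool": pool,
        "dense_w": [[rng.gauss(0, 0.1) for _ in range(n_features)]
                    for _ in range(NUM_SPEAKERS)],
        "dense_b": [0.0] * NUM_SPEAKERS,
    }


# Synthetic stand-in for an audio frame; real input would come from VCTK recordings.
signal = [math.sin(0.1 * t) for t in range(64)]
speaker, probs = predict_speaker(signal, random_params(len(signal)))
```

Because the weights are untrained, the predicted index carries no meaning; the sketch only shows how the output becomes a probability distribution over the 109 speaker classes.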