Unsupervised Filterbank Learning for Speech-based Access System for Agricultural Commodity

2017 (modified: 16 Sept 2021), ICAPR 2017, Readers: Everyone
Abstract: This paper presents an automatic speech recognition (ASR) system developed as part of a speech-based access system for agricultural commodities in the Gujarati language. The speech database was collected from farmers in the villages of Gujarat state (India) and covers various dialectal variations and real noisy acoustic environments. We use the recently proposed Convolutional Restricted Boltzmann Machine (ConvRBM) to learn the filterbank as a front-end. A self-taught learning framework is applied to train the ConvRBM using an additional Gujarati speech database from outside the agricultural commodity task. A stochastic data sweeping technique is used to speed up ConvRBM training. Experiments using time delay deep neural networks (TDNNs) show that ConvRBM features give a relative improvement of 5.5% in word error rate (WER) compared to Mel filterbank features. The system-level combination of both features further improves performance (3.55% absolute reduction in WER).
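The abstract reports one result as a relative improvement in WER and the other as an absolute reduction, and the two are easy to confuse. The following is a minimal sketch of how WER and both kinds of reduction are commonly computed. The function names and the example values are illustrative assumptions, not taken from the paper, and the paper's actual baseline WER is not stated in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in percentage points."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Reduction expressed as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WER values chosen only to illustrate the arithmetic.
    baseline, improved = 20.0, 18.9
    print(absolute_reduction(baseline, improved))
    print(relative_reduction(baseline, improved))
```

With these made-up values, a drop from 20.0% to 18.9% WER is 1.1 percentage points absolute but about 5.5% relative, which is why the two figures in the abstract are not directly comparable.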