Semi-Autonomous Data Enrichment and Optimisation for Intelligent Speech Analysis

Published: 01 Jan 2015, Last Modified: 08 Apr 2025, License: CC BY-SA 4.0
Abstract: Intelligent Speech Analysis (ISA) plays an essential role in smart conversational agent systems that aim to enable natural, intuitive, and friendly human-computer interaction. It includes not only the long-established field of Automatic Speech Recognition (ASR), but also the young field of Computational Paralinguistics, which has attracted increasing attention in recent years. In real-world applications, however, several challenging issues surrounding data quantity and quality arise. For example, the predefined databases available for most paralinguistic tasks are normally small and few in number, and are therefore insufficient for building a robust model. A distributed structure could be useful for data collection, but the original feature sets are often too large to meet physical transmission constraints such as limited bandwidth. Furthermore, in a hands-free application scenario, reverberation severely distorts speech signals, which degrades recogniser performance.