Abstract: The vowel is regarded as the essence of the syllable, as it governs the articulation of each uttered word. However, articulation sensing has not been adequately evaluated, chiefly because the speech signal alone carries insufficient information for articulation analysis. We propose a new approach to identifying the articulation of monophthongs in multiple languages. We simultaneously employ two ranges of acoustic signals, audible speech and ultrasonic signals, to recognize lip shape and tongue position, and we implement the system on an off-the-shelf smartphone for greater accessibility. The system achieves an articulation recognition accuracy of 94.74%. The proposed system can also serve as an alternative model for pronunciation training, giving articulation feedback to the user.