Improving Sign Gesture Recognition Using Bi-LSTM Attention Model with Optimal Point Cloud Frame Range

Published: 01 Jan 2024, Last Modified: 12 Jun 2025 · SCIS/ISIS 2024 · CC BY-SA 4.0
Abstract: For individuals who are hard of hearing and not proficient in sign language, assistance is vital for understanding and communication. This research aims to support them by using machine learning models to recognize sign language gestures and translate them into text. The study employs a millimeter wave (mmWave) radar sensor, chosen for its non-invasive nature, privacy protection, and insensitivity to illumination and weather, to capture frames of a sequence of performed sign gestures. The frames, captured at specific time steps, contain point cloud data. Subsequently, the frame range representing the temporal information of each performed gesture, initially varying between [11–120], was optimized to [30–54], improving LSTM model accuracy from 88.3% to 89.2% while reducing the number of gesture samples from roughly 800 to 700. Two additional models, Bi-LSTM and Bi-LSTM with Attention, were constructed to further improve accuracy; after frame range optimization, their accuracies improved from 88.7% and 89.86% to 91.7% and 92.3%, respectively.
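The abstract does not include an implementation, but a minimal sketch of the Bi-LSTM-with-Attention classifier it describes might look like the following. It assumes each gesture is zero-padded or truncated to the 54-frame upper bound of the optimized range and that each frame's point cloud has been reduced to a fixed-length feature vector; `POINT_FEATURES` and `NUM_CLASSES` are illustrative placeholders, not values from the paper.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

MAX_FRAMES = 54        # upper bound of the optimized frame range [30-54]
POINT_FEATURES = 64    # assumed per-frame point-cloud feature size (placeholder)
NUM_CLASSES = 10       # assumed number of gesture classes (placeholder)

def build_bilstm_attention(max_frames=MAX_FRAMES,
                           features=POINT_FEATURES,
                           num_classes=NUM_CLASSES):
    """Bi-LSTM encoder with additive attention pooling over time steps."""
    inputs = layers.Input(shape=(max_frames, features))
    # Bidirectional LSTM returns one hidden state per frame.
    h = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(inputs)
    # Attention: score each frame, normalize over time, take the weighted sum.
    scores = layers.Dense(1, activation="tanh")(h)   # (batch, T, 1)
    weights = layers.Softmax(axis=1)(scores)         # attention weights over frames
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])
    outputs = layers.Dense(num_classes, activation="softmax")(context)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage sketch: ~700 gesture samples after frame-range optimization.
X = np.zeros((700, MAX_FRAMES, POINT_FEATURES), dtype=np.float32)  # dummy data
y = np.random.randint(0, NUM_CLASSES, size=700)
model = build_bilstm_attention()
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)
```

Dropping the attention pooling (replacing it with the final LSTM state) yields the plain Bi-LSTM baseline, which matches how the paper compares the two variants.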