Finger Spelling Recognition from Depth Data Using Direction Cosines and Histogram of Cumulative Magnitudes

Abstract: In this paper, we propose a new approach to finger spelling recognition that uses only the depth information captured by a Kinect sensor to characterize the hand configurations corresponding to alphabet letters. First, we use the depth data to generate a binary hand mask, which segments the hand area from the background. Next, we determine the major axis of the hand and align it with the Y axis to achieve rotation invariance. We then convert the depth data into a 3D point cloud. The point cloud is divided into sub-regions, and in each one we use direction cosines to calculate three histograms of cumulative magnitudes, Hx, Hy and Hz, one for each axis. Finally, these histograms are concatenated and used as input to a Support Vector Machine (SVM) classifier. The performance of this approach is evaluated quantitatively and qualitatively on a dataset of 60,000 real depth images of American Sign Language (ASL) hand shapes. In our experiments, the approach achieves an accuracy of 99.37%, outperforming other state-of-the-art methods.
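The descriptor step can be illustrated with a minimal sketch. The abstract does not give the details, so several choices here are assumptions rather than the authors' method: vectors are measured from each sub-region's centroid, sub-regions form a regular grid in the XY plane, and each histogram bins a direction cosine over [-1, 1] while accumulating vector magnitudes. The function names, `grid`, and `n_bins` are hypothetical.

```python
import numpy as np

def cumulative_magnitude_histograms(points, n_bins=8):
    """Concatenated Hx, Hy, Hz for one sub-region of a point cloud (N x 3)."""
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    if len(points) == 0:
        return np.zeros(3 * n_bins)
    # Assumption: vectors are taken from the sub-region centroid to each point.
    vectors = points - points.mean(axis=0)
    magnitudes = np.linalg.norm(vectors, axis=1)
    keep = magnitudes > 0  # a zero vector has no defined direction
    vectors, magnitudes = vectors[keep], magnitudes[keep]
    # Direction cosines: cos(alpha) = x/|v|, cos(beta) = y/|v|, cos(gamma) = z/|v|.
    cosines = np.clip(vectors / magnitudes[:, None], -1.0, 1.0)
    hists = []
    for axis in range(3):
        # Each bin accumulates magnitudes instead of counting points.
        h, _ = np.histogram(cosines[:, axis], bins=n_bins,
                            range=(-1.0, 1.0), weights=magnitudes)
        hists.append(h)
    return np.concatenate(hists)

def hand_descriptor(points, grid=(2, 2), n_bins=8):
    """Split the cloud into a grid over X and Y, then concatenate all histograms."""
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    shape = np.array(grid)
    mins = points[:, :2].min(axis=0)
    spans = np.ptp(points[:, :2], axis=0)
    spans[spans == 0] = 1.0  # avoid division by zero for degenerate clouds
    # Grid cell index of every point; points on the upper edge go to the last cell.
    cells = np.minimum(((points[:, :2] - mins) / spans * shape).astype(int),
                       shape - 1)
    features = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            mask = (cells[:, 0] == i) & (cells[:, 1] == j)
            features.append(cumulative_magnitude_histograms(points[mask], n_bins))
    return np.concatenate(features)
```

With these assumed settings, `hand_descriptor(cloud)` returns a vector of length `grid[0] * grid[1] * 3 * n_bins`, which could then be passed to any SVM implementation as the feature vector for one hand image.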