Abstract: In this paper, we propose a new approach for dynamic hand gesture recognition using intensity, depth, and skeleton joint data captured by a Kinect™ sensor. The proposed approach integrates global and local information of a dynamic gesture. First, we represent the 3D skeleton trajectory in spherical coordinates. Then, we extract the key frames corresponding to the points with the largest angular and distance differences. In each key frame, we compute the spherical distance from the hands, wrists, and elbows to the shoulder center, and we record the changes in hand position to obtain the global information. Finally, we segment the hands and apply the SIFT descriptor to the intensity and depth data; a Bag of Visual Words (BoW) approach is then used to extract the local information. The system was tested on the ChaLearn 2013 gesture dataset and on our own Brazilian Sign Language dataset, achieving accuracies of 88.39% and 98.28%, respectively.
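To make the global-feature pipeline concrete, the sketch below illustrates the two skeleton-based steps named in the abstract: converting joint trajectories to spherical coordinates and selecting key frames where the angular and radial change between consecutive frames is largest. This is a minimal illustration under stated assumptions, not the authors' implementation; all function names (`to_spherical`, `select_key_frames`, `distances_to_shoulder_center`) and the simple sum-of-differences key-frame score are hypothetical.

```python
import numpy as np

def to_spherical(xyz):
    """Convert Cartesian joint positions (T, 3) to spherical (r, theta, phi)."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    # Polar angle; clip guards against rounding outside [-1, 1].
    theta = np.arccos(np.clip(z / np.maximum(r, 1e-12), -1.0, 1.0))
    phi = np.arctan2(y, x)  # azimuth
    return np.stack([r, theta, phi], axis=1)

def select_key_frames(traj_spherical, num_keys):
    """Pick the frames with the largest frame-to-frame angular and
    radial change (a hypothetical scoring of 'more difference')."""
    diffs = np.abs(np.diff(traj_spherical, axis=0))  # (T-1, 3)
    score = diffs.sum(axis=1)                        # combined r/theta/phi change
    idx = np.argsort(score)[-num_keys:]
    return np.sort(idx + 1)  # indices of the later frame in each pair

def distances_to_shoulder_center(joints, shoulder_center):
    """Distance from each joint (e.g., hands, wrists, elbows), shape (J, 3),
    to the shoulder-center joint in one key frame."""
    return np.linalg.norm(joints - shoulder_center, axis=1)

# Usage on synthetic data: a 60-frame trajectory of one hand joint.
hand_traj = np.cumsum(np.random.randn(60, 3) * 0.01, axis=0)
keys = select_key_frames(to_spherical(hand_traj), num_keys=8)
print("key frame indices:", keys)
```

The local-information stage (hand segmentation, SIFT on intensity and depth patches, BoW quantization) would follow the standard descriptor-then-codebook pattern, e.g., with OpenCV's SIFT and k-means clustering of descriptors into visual words.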