Abstract: This paper describes a method for the detection, tracking, and recognition of lower-arm and hand movements from color video sequences using a linguistic approach driven by motion analysis and clustering techniques. The novelty of our method comes from (i) automatic arm detection, without any manual initialization or foreground/background modeling, (ii) gesture representation at different levels of abstraction using a linguistic approach based on signal-to-symbol mapping, and (iii) robust matching for gesture recognition using the weighted longest common subsequence (of symbols). Learning vector quantization abstracts the affine motion parameters as morphological primitive units, i.e. "letters"; clustering techniques derive sequences of letters as "words" for both sub-activities and the transitions between them; and, finally, arm activities are recognized as sequences of these sub-activities. On activity cycles from six kinds of arm movements that were not available during training, i.e. slow and fast pounding, striking, swinging, swirling, and stirring, the performance achieved is perfect (100%) provided that, as should be the case for invariance purposes, slow and fast pounding video sequences are recognized as one and the same type of activity.
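The abstract outlines a two-stage pipeline: per-frame affine motion parameters are quantized into symbolic "letters" via a learned codebook, and the resulting symbol strings are matched against activity "words" with a weighted longest-common-subsequence score. The sketch below illustrates that pipeline under stated assumptions; the codebook prototypes, letter weights, length normalization, and template words are illustrative placeholders, not the parameters or training results reported in the paper.

```python
import numpy as np

# Codebook of morphological primitive units ("letters"): each row is a
# prototype in affine-motion-parameter space, as would be obtained by
# learning vector quantization (LVQ) during training. Values are assumed.
codebook = np.array([
    [ 0.9, 0.1,  0.0, 0.0],   # letter 'a' (illustrative prototype)
    [-0.8, 0.0,  0.2, 0.1],   # letter 'b'
    [ 0.0, 0.7, -0.3, 0.0],   # letter 'c'
])
letters = "abc"

def to_letters(affine_params):
    """Map each per-frame affine motion parameter vector to the nearest
    codebook prototype, yielding a string of 'letters'."""
    symbols = []
    for p in affine_params:
        idx = int(np.argmin(np.linalg.norm(codebook - p, axis=1)))
        symbols.append(letters[idx])
    return "".join(symbols)

def weighted_lcs(seq1, seq2, weight=None):
    """Weighted longest-common-subsequence score between two symbol
    sequences (dynamic programming); weight(s) lets some letters count
    more than others, with uniform weights by default."""
    if weight is None:
        weight = lambda s: 1.0
    n, m = len(seq1), len(seq2)
    dp = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if seq1[i - 1] == seq2[j - 1]:
                dp[i, j] = dp[i - 1, j - 1] + weight(seq1[i - 1])
            else:
                dp[i, j] = max(dp[i - 1, j], dp[i, j - 1])
    return dp[n, m]

def recognize(test_params, templates):
    """Label a test sequence with the activity whose template 'word'
    gives the highest length-normalized weighted LCS score."""
    test_word = to_letters(test_params)
    scores = {label: weighted_lcs(test_word, word) / max(len(word), 1)
              for label, word in templates.items()}
    return max(scores, key=scores.get)

# Toy usage with made-up template words for two activities.
templates = {"pounding": "ababab", "stirring": "accacc"}
test = np.array([[ 0.9, 0.1, 0.0, 0.0],
                 [-0.8, 0.0, 0.2, 0.1]] * 3)
print(recognize(test, templates))  # -> "pounding" for this toy input
```

The subsequence (rather than substring) matching is what gives the method some tolerance to dropped frames and local timing variation between a test sequence and a template word.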