Abstract: To deal with the problem of view invariant action recognition, this paper presents a novel approach to recognize human actions across cameras via reconstructable paths. Each action is modelled as a bag of visual-words based on the spatio-temporal features. Although this action representation is sensitive to view changes, the proposed reconstructable path is able to “translate” the action descriptor of one camera view to that of another camera view. In the learning of the paths, a dictionary is learned under each view to transform the action descriptors into a sparsely represented space, and a linear mapping function is simultaneously learned to bridge the semantic gap between the source and target spaces, such that each domain structure can be fully explored and the discrimination among action categories can be well preserved after translation. Along the reconstructable paths, an unknown action from the target view can be precisely reconstructed into any source view, and thus the SVM classifiers trained in source views are able to recognize this unknown action from target view. The proposed approach is tested on the IXMAS data set, and the experimental results achieve improved accuracy about 7% compared to other existing methods, demonstrating its effectiveness for action recognition across cameras.
0 Replies
Loading