Abstract: The lack of fine-grained joints (facial joints, finger joints) is a
fundamental performance bottleneck for state-of-the-art skeleton
action recognition models. Despite this bottleneck, the community’s
efforts seem to be invested only in devising novel architectures. To address this bottleneck specifically, we introduce two
new pose-based human action datasets: NTU60-X and NTU120-X.
Our datasets extend the largest existing action recognition dataset,
NTU-RGBD. In addition to the 25 body joints for each skeleton as in
NTU-RGBD, the NTU60-X and NTU120-X datasets include finger and
facial joints, enabling a richer skeleton representation. We adapt state-of-the-art approaches to enable training on
the introduced datasets. Our results demonstrate the effectiveness
of these NTU-X datasets in overcoming the aforementioned bottleneck and in improving state-of-the-art performance, both overall and on
the previously worst-performing action categories. Code and pretrained
models can be found at https://github.com/skelemoa/ntu-x.