Spatio-Temporal Multi-scale Soft Quantization Learning for Skeleton-Based Human Action Recognition

Jianyu Yang, Chen Zhu, Junsong Yuan

2019 (modified: 17 Sept 2021), ICME 2019, Readers: Everyone

Abstract: Effective feature representation is important for action recognition. In this paper, we propose a novel soft quantization learning method for representing visual features in action recognition. Specifically, we propose a dual multi-scale soft-quantization network, a trainable quantizer built from RBF neurons. The RBF layer has a dual multi-scale structure: a three-level hierarchical skeleton structure in space and a temporal-pyramid-based multi-scale structure in time. Each spatial level in the RBF layer has its own RBF neurons to capture hierarchical spatial information, while the temporal scales share these neurons to reduce the number of network parameters. An accumulation layer following the RBF layer summarizes the RBF outputs as a histogram representation for classification. The proposed method is end-to-end differentiable and can be trained with standard back-propagation. Experiments on benchmark datasets verify that the proposed method outperforms state-of-the-art methods.
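The pipeline described in the abstract (RBF soft assignment, per-level neurons, temporal scales sharing those neurons, histogram accumulation) can be illustrated with a minimal forward-pass sketch in NumPy. This is not the authors' implementation: the function names, the Gaussian form of the RBF, the per-frame normalization, the mean pooling, and the choice of 2**s segments at temporal scale s are all assumptions made for illustration, and the centers and widths would be learned by back-propagation in a real network rather than fixed as here.

```python
import numpy as np

def rbf_soft_assign(x, centers, widths):
    """Softly assign each frame feature in x (T, D) to K RBF neurons; returns (T, K)."""
    # Squared Euclidean distance between every frame feature and every center.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Gaussian RBF activation (assumed form), one width per neuron.
    act = np.exp(-d2 / (2.0 * widths[None, :] ** 2))
    # Normalize so each frame's memberships sum to ~1 (assumed choice).
    return act / (act.sum(axis=1, keepdims=True) + 1e-12)

def temporal_pyramid_histogram(assign, num_scales=3):
    """Accumulate soft assignments (T, K) over a temporal pyramid into one vector."""
    T, K = assign.shape
    hists = []
    for s in range(num_scales):
        # Scale s splits the sequence into 2**s contiguous segments (assumed layout).
        bounds = np.linspace(0, T, 2 ** s + 1).astype(int)
        for a, b in zip(bounds[:-1], bounds[1:]):
            seg = assign[a:b]
            # Mean pooling gives a per-segment histogram; empty segments yield zeros.
            hists.append(seg.mean(axis=0) if len(seg) else np.zeros(K))
    return np.concatenate(hists)

def encode(levels, params, num_scales=3):
    """Encode a sequence given one (T, D_l) feature array per spatial level.

    params holds one (centers, widths) pair per spatial level, so each level
    has its own RBF neurons, while all temporal scales within a level reuse
    the same soft assignments and therefore the same neurons.
    """
    parts = []
    for x, (centers, widths) in zip(levels, params):
        assign = rbf_soft_assign(x, centers, widths)
        parts.append(temporal_pyramid_histogram(assign, num_scales))
    return np.concatenate(parts)

# Hypothetical usage: 16 frames, three spatial levels with made-up feature sizes.
rng = np.random.default_rng(0)
dims, K = [6, 9, 12], 4
levels = [rng.normal(size=(16, d)) for d in dims]
params = [(rng.normal(size=(K, d)), np.full(K, 2.0)) for d in dims]
feature = encode(levels, params)  # length: 3 levels * (1 + 2 + 4) segments * K
```

Because the soft assignments are computed once per spatial level and only pooled differently at each temporal scale, adding temporal scales lengthens the histogram without adding parameters, which matches the parameter-sharing argument in the abstract.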
