Human activity prediction based on Sub-volume Relationship Descriptor

Dong-Gyu Lee, Seong-Whan Lee

2016 (modified: 05 Nov 2022), ICPR 2016

Abstract: In this paper, we address the problem of recognizing unfinished human activity from partially observed videos. Specifically, we propose a novel human activity descriptor, which uses pre-trained Convolutional Neural Networks (CNNs) to represent pairwise relationships among human activities in a compact manner by capturing discriminative sub-volumes. The potentially important relationships among all pairwise sub-volumes, called key-volumes, are automatically captured using global and local motion activation and the ratio of the participant. The key-volumes, captured without prior knowledge, hold discriminative information related to the unfinished activity. This key-volume information is incorporated when constructing the descriptor. Despite the representational power of CNNs, training a CNN model for a particular purpose requires substantial resources, such as a large amount of labeled data and computing power. Thus, we develop a method that uses a pre-trained CNN without any additional model training. Low-level features can be extracted with existing CNN toolkits. In real applications, such as implementing a smart surveillance system to understand human activity, the proposed method may therefore be more cost-effective. In our experiments, we compare the performance of the proposed method with that of other state-of-the-art human activity prediction methods on two public datasets; the results show that the proposed method outperforms these competing methods.
