Combined Static and Motion Features for Deep-Networks-Based Activity Recognition in Videos

Sameera Ramasinghe, Jathushan Rajasegaran, Vinoj Jayasundara, Kanchana Ranasinghe, Ranga Rodrigo, Ajith A. Pasqual

IEEE Trans. Circuits Syst. Video Technol., 2019 (modified: 15 Nov 2022)

Abstract: Activity recognition in videos, in a deep-learning setting or otherwise, uses both static and pre-computed motion components. How to combine the two components while keeping the burden on the deep network low remains uninvestigated. Moreover, it is not clear how much each individual component contributes, or how to control that contribution. In this paper, we use a combination of convolutional-neural-network-generated static features and motion features in the form of motion tubes. We propose three schemas for combining static and motion components: based on a variance ratio, principal components, and Cholesky decomposition. The Cholesky-decomposition-based method allows the contributions to be controlled. The ratio given by variance analysis of static and motion features matches well with the experimentally optimal ratio used in the Cholesky-decomposition-based method. The resulting activity recognition system is better than or on par with the existing state of the art when tested on three popular data sets. The findings also enable us to characterize a data set with respect to its richness in motion information.
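
The abstract does not spell out how the Cholesky-decomposition-based schema is constructed, so the following is only a minimal sketch of one common way a Cholesky factor can be used to control contributions, not the authors' implementation. It assumes one static and one motion feature vector of equal length; the function name `cholesky_fuse` and the parameter `rho` are hypothetical. The idea shown is that the Cholesky factor of a 2x2 target correlation matrix yields mixing weights, so the fused vector has correlation `rho` with the static features and the remainder comes from the motion features.

```python
import numpy as np


def cholesky_fuse(static, motion, rho):
    """Mix two feature vectors so the result has correlation `rho` with `static`.

    Hypothetical helper for illustration; `rho` in [0, 1) sets the static share.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")

    # Standardize both vectors to zero mean and unit variance.
    s = (static - static.mean()) / static.std()
    m = (motion - motion.mean()) / motion.std()

    # Remove the part of the motion vector that is linearly explained by the
    # static vector, then rescale, so the two inputs are uncorrelated.
    m = m - (s @ m) / (s @ s) * s
    m = m / m.std()

    # Cholesky factor of the target correlation matrix:
    # L = [[1, 0], [rho, sqrt(1 - rho**2)]].
    target = np.array([[1.0, rho], [rho, 1.0]])
    L = np.linalg.cholesky(target)

    # The second row of L gives the weights of the static and motion parts.
    return L[1, 0] * s + L[1, 1] * m


# Usage with synthetic data (assumed 512-dimensional features).
rng = np.random.default_rng(0)
static_vec = rng.normal(size=512)
motion_vec = rng.normal(size=512)
fused = cholesky_fuse(static_vec, motion_vec, rho=0.6)
print(np.corrcoef(fused, static_vec)[0, 1])
```

Under these assumptions, sweeping `rho` over a grid and evaluating recognition accuracy at each value would be one way to look for an experimentally optimal ratio of the kind the abstract mentions.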
