Spatial-temporal pooling for action recognition in videos

Jiaming Wang; Zhenfeng Shao; Xiao Huang; Tao Lu; Ruiqian Zhang; Xianwei Lv

Published: 01 Jan 2021, Last Modified: 14 Nov 2024, Neurocomputing 2021, CC BY-SA 4.0

Abstract: Highlights
• We propose an end-to-end approach with a novel temporal-spatial pooling block (named STP) for action classification, which learns to pool the discriminative frames and pixels in a given clip. Our method achieves better performance than other state-of-the-art methods.
• We propose an STP loss function that aims to learn a sparse importance score in the temporal dimension, discarding redundant or invalid frames.
• We present a ferryboat video database (named Ferryboat-4) for ferry action recognition. The database includes four action categories: Inshore, Offshore, Traffic, and Negative. We evaluate the proposed STP and other state-of-the-art models on this database.
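The idea of pooling frames by a learned, sparse temporal importance score can be illustrated with a minimal sketch. This is not the paper's STP block or its loss: the abstract gives no formulas, so the sigmoid gating, the normalized weighted average, and the mean-absolute-value penalty below are all assumptions chosen for illustration, and the names `temporal_pool` and `sparsity_penalty` are hypothetical.

```python
import math


def sigmoid(x):
    """Map a raw score to an importance weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def temporal_pool(frame_features, logits):
    """Importance-weighted average of per-frame feature vectors.

    frame_features: list of T feature vectors, each of length D.
    logits: list of T raw scores, one per frame (assumed to be learned).
    Returns the pooled D-dimensional vector and the per-frame weights.
    """
    weights = [sigmoid(z) for z in logits]
    total = sum(weights)
    dim = len(frame_features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, frame_features)) / total
        for d in range(dim)
    ]
    return pooled, weights


def sparsity_penalty(weights):
    """Mean absolute weight (an L1-style term, assumed here).

    Minimizing it pushes weights toward zero, so only frames that
    help the classification objective keep a large importance score.
    """
    return sum(abs(w) for w in weights) / len(weights)


# Three frames with 2-D features; the second frame gets a low score.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [4.0, -4.0, 0.0]
pooled, weights = temporal_pool(features, logits)
penalty = sparsity_penalty(weights)
```

In a trained model the penalty would be added to the classification loss with some weighting factor, so that frames contributing little to the prediction receive scores near zero and are effectively dropped from the pooled representation.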
