Spatial-temporal pooling for action recognition in videos

Published: 01 Jan 2021, Last Modified: 14 Nov 2024Neurocomputing 2021EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Highlights•We propose an end-to-end approach with a novel temporal-spatial pooling block (named STP) for action classification, which can learn pool discriminative frames and pixels in a certain clip. Our method achieves better performance than other state-of-the-art methods.•We propose a STP loss function, aiming to learn a sparse importance score in the temporal dimension, abandoning the redundant or invalid frames.•We present a ferryboat video database (named Ferryboat-4) for ferry action recognition. The database includes four action categories: Inshore, Offshore, Traffic, and Negative. We evaluate proposed STP and other state-of-the-art models on this database.
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview