Learning Spatio-Temporal Representations With Temporal Squeeze Pooling

Guoxi Huang, Adrian G. Bors

2020 (modified: 26 Apr 2023) | ICASSP 2020 | Readers: Everyone

Abstract: In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which extracts the essential movement information from a long sequence of video frames and maps it into a small set of images, named Squeezed Images. By embedding Temporal Squeeze pooling as a layer into off-the-shelf Convolutional Neural Networks (CNNs), we design a new video classification model, named Temporal Squeeze Network (TeSNet). The resulting Squeezed Images contain the essential movement information from the video frames, as determined by optimizing for the video classification task. We evaluate our architecture on two video classification benchmarks and compare the results with the state of the art.
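
To make the idea of mapping many frames into a few images concrete, here is a minimal sketch. It is not the authors' formulation: the abstract does not say how the squeeze is computed, so the sketch assumes each squeezed image is a softmax-weighted average of the input frames over time. The names `temporal_squeeze`, `frames`, and `scores` are hypothetical, and the scores are fixed by hand here, whereas in a trained model they would presumably be learned together with the classifier.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_squeeze(frames, scores):
    """Collapse T frames into K images by weighted averaging over time.

    frames: list of T frames, each a flat list of P pixel values.
    scores: list of K rows, each holding T unnormalized scores
            (hypothetical stand-in for learned parameters).
    Returns a list of K squeezed images, each a flat list of P values.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    num_pixels = len(frames[0])
    squeezed = []
    for row in scores:
        if len(row) != len(frames):
            raise ValueError("each score row needs one entry per frame")
        weights = softmax(row)  # weights over time sum to 1
        image = [
            sum(w * frame[p] for w, frame in zip(weights, frames))
            for p in range(num_pixels)
        ]
        squeezed.append(image)
    return squeezed


# Four tiny two-pixel "frames" squeezed into two images.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
scores = [
    [0.0, 0.0, 0.0, 0.0],   # uniform weights: plain temporal average
    [0.0, 0.0, 0.0, 50.0],  # nearly all weight on the last frame
]
squeezed = temporal_squeeze(frames, scores)
```

In this toy case the first squeezed image is the temporal mean of the four frames, and the second is almost identical to the last frame. A weighted average is only one of many ways to compress the time axis; it is used here because it keeps the output the same shape as a single frame, so the result could be fed to an ordinary image CNN.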
