Top attention in line with time: A light-weight strategy

ICME 2017 (modified: 07 Apr 2022)
Abstract: For video representation, dense sampling along trajectories and optical flow stacking are both computationally expensive. This paper develops a light-weight strategy that skips the computation of optical flow and trajectories. Specifically, taking frames as inputs to a pre-trained ConvNet, we extract the top layers as video feature maps. Instead of trajectory pooling, we pool these feature maps directly in line with time, an operation we name Line Pooling. We use the proposed Line-pooled Deep-convolutional Descriptors (LDDs) to weight regions with high motion saliency, which amounts to paying attention to actions in line with time. Experiments on UCF101 and HMDB51 demonstrate the efficiency and effectiveness of our method.
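The idea of pooling feature maps "in line with time" can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the pooling operator or how motion saliency is computed, so the sketch assumes max or mean pooling over the time axis at each fixed spatial position, and uses the mean absolute temporal difference of the features as a stand-in saliency measure. The function names `line_pool`, `motion_saliency`, and `weighted_descriptor` are hypothetical, and the input is assumed to be per-frame ConvNet feature maps of shape (T, C, H, W).

```python
import numpy as np


def line_pool(feature_maps, mode="max"):
    """Pool (T, C, H, W) feature maps over time at each fixed spatial position.

    Assumption: the pooling operator (max or mean) is chosen for illustration;
    the abstract does not say which one the paper uses.
    """
    if mode == "max":
        return feature_maps.max(axis=0)   # (C, H, W)
    if mode == "mean":
        return feature_maps.mean(axis=0)  # (C, H, W)
    raise ValueError(f"unknown pooling mode: {mode}")


def motion_saliency(feature_maps):
    """Hypothetical saliency proxy: how much features change over time.

    Returns an (H, W) map of non-negative weights that sums to 1.
    """
    # Absolute change between consecutive frames: (T-1, C, H, W).
    diffs = np.abs(np.diff(feature_maps, axis=0))
    # Average over time steps and channels to get one value per position.
    saliency = diffs.mean(axis=(0, 1))
    total = saliency.sum()
    if total == 0:
        # Static clip: fall back to uniform weights.
        return np.full(saliency.shape, 1.0 / saliency.size)
    return saliency / total


def weighted_descriptor(feature_maps, mode="max"):
    """Saliency-weighted sum of line-pooled features: one (C,) vector per clip."""
    pooled = line_pool(feature_maps, mode=mode)   # (C, H, W)
    weights = motion_saliency(feature_maps)       # (H, W), broadcast over C
    return (pooled * weights).sum(axis=(1, 2))    # (C,)
```

In this sketch no optical flow or trajectory is computed: each spatial position is followed as a straight line through time, and positions whose features change more across frames contribute more to the final descriptor.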