Learnable Higher-order Representation for Action Recognition

Kai Hu; Bhiksha Raj

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

TL;DR: We propose a learnable higher-order operation for context learning.

Abstract: Capturing spatiotemporal dynamics is an essential topic in video recognition. In this paper, we present the learnable higher-order operation as a generic family of building blocks for capturing higher-order correlations from the high-dimensional input video space. We prove that several successful architectures for visual classification tasks belong to the family of higher-order neural networks; theoretical and experimental analyses demonstrate that their underlying mechanism is higher-order. On the task of video recognition, even using RGB input only and without fine-tuning on other video datasets, our higher-order models achieve results on par with or better than existing state-of-the-art methods on both the Something-Something (V1 and V2) and Charades datasets.

Keywords: action recognition, context learning
