Highlights
• We study the task of recognizing the class of an ongoing action from a video stream.
• Theoretical and empirical proof of the importance of knowing the action's starting point.
• A novel method to estimate the start of the ongoing action is proposed.
• Experiments on three datasets show the effectiveness of the proposed method.

Abstract
We address the task of recognizing the category of an ongoing human action from a video stream. This task is challenging because categorization decisions must be made from partial evidence: the action has not finished, and not all information about it has been observed. The task is further complicated because the ongoing action is embedded in a continuous stream of data and its start is not given. Existing methods for early recognition usually ignore this issue and make the unrealistic assumption that the starting point of the ongoing action is available. In this paper, we prove the importance of starting point detection and then propose a method to determine the start of an ongoing action. Our method is based on a bidirectional recurrent neural network that computes the probability that a frame is the starting point by comparing the dynamics of the actions before and after that frame. Experiments on three datasets show that our method can reliably detect the starting point of an ongoing action, which improves early recognition accuracy.

MSC: 41A05, 41A10, 65D05, 65D17
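The idea of scoring each frame by comparing what comes before it with what comes after it can be illustrated with a small sketch. This is not the authors' network: the abstract gives no architecture details, so everything below is an assumption made for illustration, including the plain tanh recurrence, the untrained random weights, the softmax over frames (which presumes exactly one start inside the window), and the hypothetical names rnn_pass, init_params and start_point_probabilities.

```python
import numpy as np


def rnn_pass(x, W_in, W_rec, b):
    """Run a plain tanh RNN over frames x of shape (T, D); return states (T, H)."""
    T = x.shape[0]
    H = W_rec.shape[0]
    h = np.zeros(H)
    states = np.zeros((T, H))
    for t in range(T):
        h = np.tanh(x[t] @ W_in + h @ W_rec + b)
        states[t] = h
    return states


def init_params(dim, hidden, seed=0):
    """Random, untrained weights (assumption: a real model would learn these)."""
    rng = np.random.default_rng(seed)

    def mat(*shape):
        return rng.normal(scale=0.1, size=shape)

    return {
        "W_in_f": mat(dim, hidden), "W_rec_f": mat(hidden, hidden), "b_f": np.zeros(hidden),
        "W_in_b": mat(dim, hidden), "W_rec_b": mat(hidden, hidden), "b_b": np.zeros(hidden),
        "w_out": mat(2 * hidden),
    }


def start_point_probabilities(x, params):
    """Return one probability per frame that it is the start of the action."""
    # Forward pass: state at t summarizes frames 0..t (dynamics before the frame).
    fwd = rnn_pass(x, params["W_in_f"], params["W_rec_f"], params["b_f"])
    # Backward pass: run on reversed frames, then flip back so that the state
    # at t summarizes frames t..T-1 (dynamics after the frame).
    bwd = rnn_pass(x[::-1], params["W_in_b"], params["W_rec_b"], params["b_b"])[::-1]
    # Score each frame from both summaries, then normalize across frames.
    scores = np.concatenate([fwd, bwd], axis=1) @ params["w_out"]
    scores = scores - scores.max()  # numerical stability for the softmax
    probs = np.exp(scores)
    return probs / probs.sum()


if __name__ == "__main__":
    # Hypothetical input: 20 frames of 8-dimensional per-frame features.
    frames = np.random.default_rng(1).normal(size=(20, 8))
    probs = start_point_probabilities(frames, init_params(dim=8, hidden=16))
    print("most likely start frame:", int(np.argmax(probs)))
```

With untrained weights the output carries no meaning; the sketch only shows how the forward and backward summaries are combined into a per-frame distribution. Training such a model would require frame-level start annotations, which the abstract does not describe.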