Abstract: Observing a video stream and being able to predict target events of interest before they
occur is an important but challenging task due to the stochastic nature of visual events.
This task requires a classifier that can distinguish precursory signals that lead to the target events
from those that do not. However, a naïve approach to training this classifier would
require seeing many examples of the target events before a model with high precision
can be obtained. In this paper, we propose a method for early prediction of visual events
based on an ensemble of exemplar predictors. Each exemplar predictor is associated
with a single instance of the target event and is trained to separate that instance from
negative samples. The exemplar predictors can then be calibrated and integrated into a
stronger predictor. Experiments on several datasets show that the proposed exemplar-based
framework outperforms other methods, achieving higher precision with fewer training
samples. Our code and datasets can be found at github.com/cvlab-stonybrook/EnEx.