Abstract: Complex human events are high-level human activities that are composed of a set of interacting primitive human
actions over time. Complex human event recognition is important for many applications, including security
surveillance, healthcare, sports and games. Complex human event recognition requires recognizing not only
the constituent primitive actions but also, more importantly, their long-range spatiotemporal interactions. To
meet this requirement, we propose to exploit the self-attention mechanism in the Transformer to model and
capture the long-range interactions among primitive actions. We further extend the conventional Transformer
to a probabilistic Transformer in order to quantify the event recognition confidence and to detect anomalous
events. Specifically, given a sequence of human 3D skeletons, the proposed model first performs primitive action
localization and recognition. The recognized primitive human actions and their features are then fed into the
probabilistic Transformer for complex human event recognition. By using a probabilistic attention score, the
probabilistic Transformer can not only recognize complex events but also quantify its prediction uncertainty.
Using the prediction uncertainty, we further propose to detect anomalous events in an unsupervised manner. We
evaluate the proposed probabilistic Transformer on the FineDiving dataset and the Olympics Sports dataset for both
complex event recognition and anomalous event detection. The FineDiving dataset consists of complex events composed of
primitive diving actions. The experimental results show that our method outperforms the baseline methods.
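
The idea of turning a probabilistic attention score into a prediction uncertainty, and then into an unsupervised anomaly flag, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: attention scores are modeled as independent Gaussians, uncertainty is the entropy of the Monte Carlo averaged class distribution, and the names `predict_event`, `is_anomalous`, `score_means`, `score_std`, and the threshold value are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict_event(action_logits, score_means, score_std, num_samples=200, seed=0):
    """Illustrative stand-in for probabilistic attention pooling.

    action_logits: one event-class logit vector per recognized primitive action.
    score_means:   assumed mean attention score per primitive action.
    score_std:     assumed standard deviation shared by all attention scores.
    Returns the predicted event label and the predictive entropy.
    """
    rng = random.Random(seed)
    num_classes = len(action_logits[0])
    mean_probs = [0.0] * num_classes
    for _ in range(num_samples):
        # Sample attention scores instead of using a single point estimate.
        scores = [rng.gauss(mu, score_std) for mu in score_means]
        weights = softmax(scores)
        # Attention-weighted pooling of the per-action logits.
        pooled = [
            sum(w * logits[k] for w, logits in zip(weights, action_logits))
            for k in range(num_classes)
        ]
        probs = softmax(pooled)
        for k in range(num_classes):
            mean_probs[k] += probs[k] / num_samples
    label = max(range(num_classes), key=mean_probs.__getitem__)
    return label, entropy(mean_probs)


def is_anomalous(uncertainty, threshold):
    """Flag an event as anomalous when its uncertainty exceeds a chosen threshold."""
    return uncertainty > threshold


# Primitive actions that agree on the event class versus actions that conflict.
consistent = [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
conflicting = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
label_a, unc_a = predict_event(consistent, [0.0, 0.0], score_std=2.0)
label_b, unc_b = predict_event(conflicting, [0.0, 0.0], score_std=2.0)
```

In this toy setup, the event built from conflicting primitive actions yields a higher predictive entropy than the consistent one, so a threshold chosen from the uncertainties of normal events (for example a high quantile) would flag it without any anomaly labels.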