Abstract: A complex action consists of a sequence of atomic
actions that interact with each other over a relatively
long period of time. This paper introduces a probabilistic model named Uncertainty-Guided Probabilistic Transformer (UGPT) for complex action recognition. The self-attention mechanism of a Transformer is used to capture the
complex, long-term dynamics of these actions. By
explicitly modeling the distribution of the attention scores,
we extend the deterministic Transformer to a probabilistic
Transformer in order to quantify the uncertainty of the prediction. The model prediction uncertainty is used to improve both training and inference. Specifically, we propose
a novel training strategy by introducing a majority model
and a minority model based on the epistemic uncertainty.
During inference, the prediction is made jointly by both
models through a dynamic fusion strategy. Our method is
validated on benchmark datasets, including Breakfast
Actions, MultiTHUMOS, and Charades. The experimental results show that our model achieves state-of-the-art performance under both sufficient and insufficient data.
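
To make the uncertainty-guided fusion idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes each model yields several sampled class-probability vectors (for example, from sampling the attention scores), estimates epistemic uncertainty as the mutual information between the prediction and the sampled parameters (a standard estimator), and fuses the two models with weights that shrink as uncertainty grows. The function names and the exponential weighting rule are hypothetical choices made for illustration; the abstract does not specify the fusion rule.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of probability vectors along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def epistemic_uncertainty(samples):
    """Mutual-information estimate of epistemic uncertainty.

    samples: array of shape (S, C) holding class probabilities from
    S stochastic forward passes of one model.
    """
    mean_p = samples.mean(axis=0)
    total = entropy(mean_p)              # entropy of the averaged prediction
    aleatoric = entropy(samples).mean()  # average entropy of each sample
    return total - aleatoric

def fuse(majority_samples, minority_samples):
    """Hypothetical fusion rule: weight each model by exp(-uncertainty)."""
    u = np.array([
        epistemic_uncertainty(majority_samples),
        epistemic_uncertainty(minority_samples),
    ])
    w = np.exp(-u)
    w = w / w.sum()  # normalize so the two weights sum to 1
    fused = (w[0] * majority_samples.mean(axis=0)
             + w[1] * minority_samples.mean(axis=0))
    return fused, w

# Toy inputs: the first model's samples agree, the second model's disagree.
agree = np.array([[0.9, 0.1], [0.9, 0.1]])
disagree = np.array([[0.9, 0.1], [0.1, 0.9]])
fused, weights = fuse(agree, disagree)
```

In this toy case the model whose samples agree has near-zero epistemic uncertainty and therefore receives the larger fusion weight.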