Complex Activity Recognition Via Attribute Dynamics

Weixin Li, Nuno Vasconcelos

07 Jan 2023, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: The problem of modeling the dynamic structure of human activities is considered. Video is mapped to a semantic feature space, which encodes activity attribute probabilities over time. The binary dynamic system (BDS) model is proposed to jointly learn the distribution and dynamics of activities in this space. This is a non-linear dynamic system that combines binary observation variables and a hidden Gauss-Markov state process, extending both binary principal component analysis (PCA) and the classical linear dynamic systems (LDS). A BDS learning algorithm, inspired by the popular dynamic texture, and a dissimilarity measure between BDSs, which generalizes the Binet-Cauchy kernel, are introduced. To enable the recognition of highly non-stationary activities, the BDS is embedded in a bag of words. An algorithm is introduced for learning a BDS codebook, enabling the use of the BDS as a visual word for attribute dynamics (WAD). Short-term video segments are then quantized with a WAD codebook, allowing the representation of video as a bag-of-words for attribute dynamics (BoWAD). Video sequences are finally encoded as vectors of locally aggregated descriptors (VLAD), which summarize the first moments of video snippets on the BDS manifold. Experiments show that this representation achieves state-of-the-art performance on the tasks of complex activity recognition and event identification.
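To make the structure of the BDS concrete, the following is a minimal sketch of sampling from a model of this kind: a hidden state that evolves as a linear Gauss-Markov process, and binary attribute observations drawn from Bernoulli distributions. The abstract does not give the parametrization, so everything specific here is an assumption made for illustration: the logistic link, the names A (state transition), C (observation matrix), u (bias), the isotropic noise level noise_std, and the function sample_bds itself. It is not the authors' learning algorithm, only a toy generative view of the model.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function (assumed link; not stated in the abstract).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_bds(A, C, u, T, noise_std=0.1, seed=0):
    """Draw T binary attribute vectors from a toy BDS-style model.

    A: n x n state transition matrix (hypothetical name)
    C: k x n observation matrix (hypothetical name)
    u: length-k bias vector (hypothetical name)
    """
    rng = random.Random(seed)
    n = len(A)  # hidden state dimension
    k = len(C)  # number of binary attributes
    # Initial hidden state drawn from a standard Gaussian (an assumption).
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = []
    for _ in range(T):
        # Observation: each attribute is Bernoulli with logit (C x + u).
        logits = [sum(C[i][j] * x[j] for j in range(n)) + u[i] for i in range(k)]
        ys.append([1 if rng.random() < sigmoid(l) else 0 for l in logits])
        # State transition: linear Gauss-Markov step, x <- A x + Gaussian noise.
        x = [
            sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, noise_std)
            for i in range(n)
        ]
    return ys


# Toy usage: 2-dimensional hidden state, 3 binary attributes, 5 time steps.
A = [[0.9, 0.1], [-0.1, 0.9]]
C = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
u = [0.0, 0.0, 0.0]
ys = sample_bds(A, C, u, T=5)
```

With the hidden dynamics removed (each state drawn independently), a model of this shape reduces to a binary-PCA-like latent variable model; with the Bernoulli observations replaced by Gaussian ones, it reduces to an LDS. This is the sense in which the BDS extends both.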
