Multiperson Activity Recognition and Tracking Based on Skeletal Keypoint Detection

Hai-Sheng Li, Jing-Yin Chen, Haiying Xia

Published: 2024, Last Modified: 13 Nov 2024. IEEE Trans. Artif. Intell. 2024. CC BY-SA 4.0

Abstract: Most current action recognition networks are deep, have large numbers of parameters, and place high demands on computer hardware. Networks with too many layers also overfit easily during recognition. Furthermore, interference in the video, such as illumination changes and occlusion, makes feature extraction difficult. To solve these problems, we propose a multiperson action recognition and tracking algorithm based on skeletal keypoint detection. First, a network combining an improved dense convolutional network with part affinity fields extracts the skeletal keypoints of the human body. Then, we present an improved DeepSort network for multiperson target tracking, which contains a Hungarian matching algorithm based on the generalized intersection over union and a pedestrian reidentification network combining GhostNet and a feature pyramid network. Finally, we construct a deep neural network model that classifies the extracted human skeletal information to perform action recognition. Experimental results show that the proposed algorithm achieves an action recognition accuracy of 98%. In addition, it improves multitarget tracking accuracy by 4.2% on the MOT16 dataset. Compared with other common algorithms, the proposed algorithm detects human keypoints with high accuracy and improves the accuracy of multiperson action recognition with fewer parameters and lower computational complexity.
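The matching step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cost 1 - GIoU, the box format (x1, y1, x2, y2), and the function names `giou` and `match` are assumptions made for illustration, and an exhaustive search over assignments stands in for the Hungarian algorithm (it finds the same minimum-cost assignment, but is only practical for a handful of boxes).

```python
from itertools import permutations


def giou(a, b):
    """Generalized IoU of two boxes (x1, y1, x2, y2) with positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    # IoU minus the fraction of the enclosing box not covered by the union.
    return inter / union - (hull - union) / hull


def match(tracks, detections):
    """Pair each track with a distinct detection, minimizing total 1 - GIoU.

    Assumes len(tracks) <= len(detections). Exhaustive search is used here
    as a stand-in for the Hungarian algorithm.
    """
    cost = [[1.0 - giou(t, d) for d in detections] for t in tracks]
    best = min(
        permutations(range(len(detections)), len(tracks)),
        key=lambda p: sum(cost[i][j] for i, j in enumerate(p)),
    )
    return list(enumerate(best))


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
pairs = match(tracks, detections)  # list of (track index, detection index)
```

Unlike plain IoU, GIoU stays informative for non-overlapping boxes: it becomes more negative as the boxes move apart, so the cost still ranks distant candidates. In practice the exhaustive search would be replaced by a polynomial-time solver such as `scipy.optimize.linear_sum_assignment`.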