Enhancing Human Pose Estimation with SE-Block in the OmniPose Model

Khac-Anh Phu, Van-Dung Hoang, Van-Tuong-Lan Le, Thinh Vinh Le

Published: 2024, Last Modified: 19 Jun 2025 | Venue: HSI 2024 | License: CC BY-SA 4.0

Abstract: Human-computer interaction has broadened computer vision research and raised both opportunities and challenges for developing human action recognition applications. In this domain, recognizing human actions from image or video data plays a crucial role in practical applications ranging from security surveillance to interactive control. Although the OmniPose model has demonstrated its effectiveness, there is still room to improve its performance. In this study, we focus on enhancing OmniPose, which is used to extract skeleton data from input images. We propose two improvements: incorporating a Self-Attention mechanism and employing Squeeze-and-Excitation (SE) to strengthen the model's skeleton data extraction capability. With these changes, we aim to improve the performance of OmniPose in skeleton data extraction and human pose recognition, and to support further advances in human action recognition in computer vision.
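
For readers unfamiliar with Squeeze-and-Excitation, the following is a minimal sketch of a generic SE block in plain Python. It is not the authors' implementation: the abstract does not say where the block sits inside OmniPose or which hyperparameters are used, so the function name `se_block`, the reduction ratio `r`, the random weights, and the omission of bias terms are all assumptions made for illustration. The sketch only shows the three standard steps: squeeze (global average pooling per channel), excitation (two fully connected layers with ReLU and sigmoid), and channel-wise rescaling.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_map, w1, w2):
    """Generic Squeeze-and-Excitation over a C x H x W nested list.

    feature_map: list of C channels, each an H x W list of floats.
    w1: (C // r) x C weights of the first fully connected layer (assumed, no bias).
    w2: C x (C // r) weights of the second fully connected layer (assumed, no bias).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling reduces each channel to one number.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, part 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: expand back to C channels, sigmoid gives gates in (0, 1).
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Illustrative usage with hypothetical sizes and random weights.
random.seed(0)
C, H, W, r = 4, 3, 3, 2
features = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
scaled, gates = se_block(features, w1, w2)
print("per-channel gates:", gates)
```

The output keeps the input shape, and each channel is multiplied by a single learned weight between 0 and 1, which is how an SE block lets a network emphasize informative feature channels. In a trained model the weights would be learned rather than random, and the block would typically be written as a layer in a deep learning framework.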