KeypointNet: An Efficient Deep Learning Model with Multi-View Recognition Capability for Sitting Posture Recognition

Zheng Cao, Xuan Wu, Chunguo Wu, Shuyang Jiao, Yubin Xiao, Yu Zhang, You Zhou

Published: 12 Feb 2025, Last Modified: 27 Jan 2026, Electronics, CC BY-SA 4.0
Abstract: Numerous studies use pose estimation to extract human keypoint data and then classify sitting postures. However, classifying keypoints directly with neural networks often yields suboptimal results, while converting keypoints into other data representations before classification introduces redundant information and substantially increases inference time. In addition, most existing methods perform well only under a single fixed viewpoint, which limits their applicability in complex real-world scenarios involving unseen viewpoints. To address the first limitation, we propose KeypointNet, which employs a decoupled feature extraction strategy consisting of a Keypoint Feature Extraction module and a Multi-Scale Feature Extraction module. To enhance multi-view recognition capability, we propose the Multi-View Simulation (MVS) algorithm, which augments viewpoint information by first rotating the keypoints and then repositioning the camera. We also introduce the multi-view sitting posture (MVSP) dataset, designed to simulate diverse real-world viewpoints. Experimental results demonstrate that KeypointNet outperforms other state-of-the-art methods on both the proposed MVSP dataset and other public datasets, while maintaining a lightweight and efficient design. Ablation studies demonstrate the effectiveness of MVS and of all KeypointNet modules. Additional experiments highlight KeypointNet's superior generalization, small-sample learning capability, and robustness to unseen viewpoints.
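The abstract describes MVS only as rotating keypoints and then repositioning the camera, so the following is a minimal illustrative sketch of that two-step idea, not the authors' implementation. It assumes 3D keypoints, a rotation about the vertical axis, a camera looking along +z, and a pinhole projection; the function names `rotate_keypoints`, `reposition_camera`, and `simulate_view`, and the parameters `yaw_deg`, `camera_position`, and `focal_length`, are all hypothetical.

```python
import numpy as np


def rotate_keypoints(keypoints, yaw_deg):
    """Rotate (N, 3) keypoints about the vertical (y) axis by yaw_deg degrees.

    Assumption: the abstract does not say which axis or angle range MVS uses.
    """
    theta = np.deg2rad(yaw_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([
        [c, 0.0, s],
        [0.0, 1.0, 0.0],
        [-s, 0.0, c],
    ])
    # Each row v becomes rot_y @ v.
    return keypoints @ rot_y.T


def reposition_camera(keypoints, camera_position, focal_length=1.0):
    """Express keypoints relative to a camera at camera_position looking
    along +z, then project them to 2D with a pinhole model (assumed)."""
    relative = keypoints - np.asarray(camera_position, dtype=float)
    depth = relative[:, 2:3]
    if np.any(depth <= 0):
        raise ValueError("all keypoints must lie in front of the camera")
    return focal_length * relative[:, :2] / depth


def simulate_view(keypoints, yaw_deg, camera_position, focal_length=1.0):
    """Hypothetical two-step augmentation: rotate first, then move the camera."""
    rotated = rotate_keypoints(keypoints, yaw_deg)
    return reposition_camera(rotated, camera_position, focal_length)


if __name__ == "__main__":
    pose = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    print(simulate_view(pose, yaw_deg=90.0, camera_position=(0.0, 0.0, -3.0)))
```

Under these assumptions, sampling several `yaw_deg` and `camera_position` values per training pose would yield additional simulated viewpoints of the same posture label.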