Resolution Irrelevant Encoding and Difficulty Balanced Loss Based Network Independent Supervision for Multi-Person Pose Estimation

Abstract: Sustained efforts have been made to improve accuracy in multi-person pose estimation, but current accuracy is still insufficient for real-world applications. Moreover, most improvements are designed for specific backbone networks and ignore speed, which limits their applicability and cost-effectiveness. This paper proposes two network-independent supervision methods: Resolution Irrelevant Encoding and Difficulty Balanced Loss. The proposed methods reorganize the task representation, the loss calculation method, and the loss penalty ratio in one-stage pose estimation frameworks to improve joint localization accuracy while retaining general applicability and high computational efficiency. Resolution Irrelevant Encoding fuses heatmaps with the proposed inner block offsets to determine pixel-level joint positions without being limited by heatmap resolution. To improve network training efficiency, Difficulty Balanced Loss adjusts the loss weights in both spatial and sequential aspects. On the MS COCO keypoint detection benchmark, OpenPose trained with our proposals outperforms the OpenPose baseline by more than 4.9% in mAP.
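The idea of combining a coarse heatmap with an offset inside each heatmap cell can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not define the inner block offsets precisely, so the sketch assumes the common formulation in which a joint is stored as a grid cell plus a fractional offset normalised by the stride. The function names `encode_joint`, `decode_joint`, and `decode_heatmap_only` are hypothetical.

```python
def encode_joint(x, y, stride):
    """Split a pixel position into a coarse grid cell and a fractional offset inside it."""
    cell_x, cell_y = int(x // stride), int(y // stride)
    # Assumption: offsets are normalised to [0, 1) so they do not depend on the stride.
    off_x = (x - cell_x * stride) / stride
    off_y = (y - cell_y * stride) / stride
    return (cell_x, cell_y), (off_x, off_y)


def decode_joint(cell, offset, stride):
    """Recombine the grid cell and the in-cell offset into a pixel position."""
    (cell_x, cell_y), (off_x, off_y) = cell, offset
    return (cell_x + off_x) * stride, (cell_y + off_y) * stride


def decode_heatmap_only(cell, stride):
    """Baseline: use the cell centre, so accuracy is capped by the heatmap resolution."""
    cell_x, cell_y = cell
    return (cell_x + 0.5) * stride, (cell_y + 0.5) * stride


if __name__ == "__main__":
    joint = (203.0, 117.0)  # illustrative ground-truth position in input-image pixels
    for stride in (4, 8, 16):
        cell, offset = encode_joint(*joint, stride)
        print(stride, decode_heatmap_only(cell, stride), decode_joint(cell, offset, stride))
```

Under these assumptions, the heatmap-only decode drifts further from the true position as the stride grows, while the cell-plus-offset decode recovers the original coordinates at every stride, which is the sense in which such an encoding does not depend on heatmap resolution.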