3DSA: Multi-view 3D Human Pose Estimation With 3D Space Attention Mechanisms

Bo-Han Chen, Chia-Chi Tsai

Published: 01 Jan 2024, Last Modified: 05 Mar 2025. Venue: ECCV (27) 2024. License: CC BY-SA 4.0

Abstract: In this study, we introduce the 3D space attention module (3DSA), a novel approach that addresses a drawback of multi-view 3D human pose estimation methods: they fail to recognize the object's significance from diverse viewpoints. Specifically, we use a 3D space subdivision algorithm to divide the feature volume into multiple regions. Predicted 3D space attention scores are then assigned to these regions to construct a feature volume with space attention. The purpose of the 3D space attention module is to distinguish the significance of individual regions within the feature volume by applying weighted attention adjustments derived from the corresponding viewpoints. We conduct experiments on two existing voxel-based methods, VoxelPose and Faster VoxelPose. With the space attention module incorporated, both achieve state-of-the-art performance on the CMU Panoptic Studio dataset.
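
The region-weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the uniform grid subdivision, the function name `apply_space_attention`, the array shapes, and the use of externally supplied scores (rather than scores predicted by a network) are all assumptions made for illustration.

```python
import numpy as np


def apply_space_attention(volume, scores):
    """Scale each region of a feature volume by its attention score.

    Hypothetical helper, not taken from the paper.
    volume: array of shape (C, X, Y, Z), the feature volume of one view.
    scores: array of shape (rx, ry, rz), one attention score per region.
            Assumes a uniform grid, so X, Y, Z must be divisible by rx, ry, rz.
    """
    _, x, y, z = volume.shape
    rx, ry, rz = scores.shape
    if x % rx or y % ry or z % rz:
        raise ValueError("volume dimensions must be divisible by the region grid")

    # Expand each region score so it covers every voxel in that region.
    voxel_scores = scores
    for axis, reps in enumerate((x // rx, y // ry, z // rz)):
        voxel_scores = np.repeat(voxel_scores, reps, axis=axis)

    # Broadcast the per-voxel scores over the channel axis.
    return volume * voxel_scores[None]


# Example: a 4x4x4 volume with 2 channels, split into a 2x2x2 grid of regions.
volume = np.ones((2, 4, 4, 4))
scores = np.arange(8, dtype=float).reshape(2, 2, 2)
weighted = apply_space_attention(volume, scores)
```

In a multi-view setting, one such weighted volume would presumably be computed per camera view before the views are fused; how the scores are predicted and how the views are combined is left to the paper itself.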