Abstract: Hand pose estimation from depth images is a challenging problem in human-computer interaction. In this paper, we propose a novel approach to hand pose estimation that combines the merits of deep learning based hand segmentation and dynamics based pose optimization. For hand segmentation, we propose `Dynamic Projected Segmentation Networks', which are applied to depth images and produce a pixel-wise classification result. To preserve the detailed topological structure of the hand region, we design a dynamic projection based hand-region extraction method that crops the hand region from depth images. The projected hand region is then fed into lightweight `Encoder-Decoder Networks' for segmentation. For pose optimization, we employ rigid body dynamics to estimate the final pose, treating the segmentation results as hand geometry constraints. Experiments show that our approach outperforms state-of-the-art methods on two challenging datasets.