Abstract: With the rising popularity of augmented reality (AR) and smart head-mounted devices, natural human-device interaction, especially hand-gesture-based interaction, is increasingly important. This paper presents a solution for pointing-gesture-based interaction in egocentric vision, together with an application. First, we establish a dataset named EgoFinger that focuses on the pointing gesture in egocentric vision. We describe the data collection procedure and provide a comprehensive analysis of the dataset, covering background and foreground color distributions, hand occurrence likelihood, the scale and pointing-angle distributions of hands and fingers, and the manual labeling error. The analysis shows that the dataset contains a substantial number of samples spanning varied environments and dynamic hand shapes. Furthermore, we propose a two-stage framework consisting of Faster R-CNN based hand detection followed by dual-target fingertip detection. Compared with state-of-the-art tracking and detection algorithms, it performs best on both hand and fingertip detection. Trained on the large-scale dataset, it achieves a fingertip detection error of about 12.22 pixels in a 640 × 480 video frame. Finally, using the fingertip detection results, we design and implement an input system for egocentric vision, Ego-Air-Writing. Treating the fingertip as a pen, a user wearing smart glasses can write characters in the air and interact with the system through simple hand gestures.
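To make the two-stage design concrete, the sketch below shows one plausible way to wire it up in PyTorch: a Faster R-CNN hand detector (stage one) whose highest-scoring box is cropped and fed to a small fingertip regressor (stage two). This is a minimal illustration under stated assumptions, not the paper's implementation: the regressor architecture, the 96 × 96 patch size, the single-point fingertip target (the paper's dual-target scheme is simplified here), and the names `FingertipRegressor` and `locate_fingertip` are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Stage 1: hand detector. Two classes: background + hand.
# Weights are untrained here; the paper trains on EgoFinger.
hand_detector = fasterrcnn_resnet50_fpn(num_classes=2)
hand_detector.eval()

class FingertipRegressor(nn.Module):
    """Stage 2: regress a normalized fingertip (x, y) inside a hand crop.
    Hypothetical architecture; the paper does not specify these layers."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, 2)  # (x, y) in [0, 1] patch coordinates

    def forward(self, patch):
        return self.head(self.features(patch).flatten(1))

fingertip_net = FingertipRegressor()
fingertip_net.eval()

@torch.no_grad()
def locate_fingertip(frame):
    """frame: float tensor (3, H, W) in [0, 1], e.g. a 480x640 video frame.
    Returns the fingertip (x, y) in frame coordinates, or None if no hand."""
    det = hand_detector([frame])[0]
    if det["boxes"].numel() == 0:
        return None
    # Keep only the most confident hand box.
    x1, y1, x2, y2 = det["boxes"][det["scores"].argmax()].tolist()
    crop = frame[:, int(y1):int(y2), int(x1):int(x2)]
    patch = F.interpolate(crop.unsqueeze(0), size=(96, 96),
                          mode="bilinear", align_corners=False)
    # Map the normalized prediction back to full-frame coordinates.
    fx, fy = fingertip_net(patch)[0].tolist()
    return (x1 + fx * (x2 - x1), y1 + fy * (y2 - y1))
```

Cropping first lets the fingertip network operate at a fixed, hand-centric resolution regardless of how far the hand is from the camera, which is the usual motivation for cascading a detector in front of a keypoint regressor.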