A real-time human-robot interaction framework with robust background invariant hand gesture detection

17 May 2020, OpenReview Archive Direct Upload
Abstract: In the factories of the future, productive and safe interaction between robots and their human coworkers requires the robot to extract essential information about the coworker. We address this by designing a reliable framework for real-time, safe human-robot collaboration based on static hand gestures and 3D skeleton extraction. The OpenPose library is integrated with the Microsoft Kinect V2 to obtain a 3D estimate of the human skeleton. With the help of 10 volunteers, we recorded an image dataset of alphanumeric static hand gestures taken from American Sign Language; we named this dataset OpenSign and released it to the community for benchmarking. An Inception V3 convolutional neural network is adapted and trained to detect the hand gestures. To augment the training data, we use OpenPose to localize the hands in the dataset images and segment the hand backgrounds by exploiting the Kinect V2 depth map; the backgrounds are then substituted with random patterns and indoor architecture templates. Fine-tuning of Inception V3 is performed in three phases, achieving a validation accuracy of 99.1% and a test accuracy of 98.9%. Image acquisition and hand gesture detection are integrated asynchronously to ensure real-time detection. Finally, the proposed framework is integrated into our physical human-robot interaction library, OpenPHRI, complementing it with an implementation of the ISO/TS 15066 "safety-rated monitored stop" and "speed and separation monitoring" collaborative modes. We validate the performance of the proposed framework through a complete teaching-by-demonstration experiment with a robotic manipulator.
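To illustrate the depth-based background substitution used for data augmentation, here is a minimal Python/OpenCV sketch, not the paper's implementation: the function name substitute_background, the assumed hand distance, and the tolerance values are all illustrative assumptions.

```python
import cv2
import numpy as np

def substitute_background(rgb, depth, background, hand_depth_mm=900, tol_mm=150):
    """Replace the scene behind a depth-segmented hand with a new background.

    rgb        : HxWx3 color crop around the hand (aligned with depth).
    depth      : HxW depth map in millimetres (Kinect V2 style).
    background : HxWx3 substitute image (random pattern / indoor template).
    hand_depth_mm, tol_mm : assumed hand distance from the sensor and
                            tolerance; both values are illustrative.
    """
    # Keep pixels whose depth lies within the assumed hand range.
    mask = np.abs(depth.astype(np.int32) - hand_depth_mm) < tol_mm

    # Clean up the mask with a small morphological closing.
    mask = cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE,
                            np.ones((5, 5), np.uint8)).astype(bool)

    # Composite: hand pixels from the RGB crop, everything else from the background.
    out = background.copy()
    out[mask] = rgb[mask]
    return out
```

Because the substituted backgrounds vary while the hand pixels stay fixed, the classifier is pushed to rely on hand appearance rather than scene context, which is what makes the detector background-invariant.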
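The three-phase fine-tuning of Inception V3 could look roughly like the following Keras sketch. The paper does not specify which layers are unfrozen in each phase or the learning rates, so NUM_CLASSES, the set_trainable helper, and all hyperparameters here are assumptions.

```python
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models, optimizers

NUM_CLASSES = 36  # alphanumeric gestures; the exact count is an assumption

# ImageNet-pretrained backbone with a fresh softmax classification head.
base = InceptionV3(weights="imagenet", include_top=False, pooling="avg",
                   input_shape=(299, 299, 3))
head = layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
model = models.Model(base.input, head)

def set_trainable(model, n_top_layers):
    """Freeze everything, then unfreeze only the top n layers."""
    for layer in model.layers:
        layer.trainable = False
    for layer in model.layers[-n_top_layers:]:
        layer.trainable = True

# Phase 1: train only the new classification head.
set_trainable(model, 1)
model.compile(optimizers.Adam(1e-3), "categorical_crossentropy", ["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)

# Phase 2: unfreeze the top Inception blocks at a lower learning rate.
set_trainable(model, 60)
model.compile(optimizers.Adam(1e-4), "categorical_crossentropy", ["accuracy"])
# model.fit(...)

# Phase 3: fine-tune the whole network with a very small learning rate.
set_trainable(model, len(model.layers))
model.compile(optimizers.Adam(1e-5), "categorical_crossentropy", ["accuracy"])
# model.fit(...)
```

Recompiling after each phase is required for the changed trainable flags to take effect; lowering the learning rate as more layers unfreeze protects the pretrained features from being destroyed early in training.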
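The asynchronous integration of image acquisition and gesture detection can be sketched with a single-slot queue that always holds the latest frame, so the detector never blocks the camera and never classifies stale images. This is an assumed design, not the paper's code; camera, classifier, and on_gesture are hypothetical stand-ins.

```python
import queue
import threading

frame_q = queue.Queue(maxsize=1)   # holds only the most recent frame

def acquire(camera):
    """Acquisition thread: always keep the freshest frame available."""
    while True:
        frame = camera.read()        # hypothetical capture call
        try:
            frame_q.get_nowait()     # drop the stale frame, if any
        except queue.Empty:
            pass
        frame_q.put(frame)

def detect(classifier, on_gesture):
    """Detection thread: classify gestures at its own pace."""
    while True:
        frame = frame_q.get()        # blocks until a fresh frame arrives
        on_gesture(classifier(frame))  # e.g. forward the label robot-side

# Hypothetical wiring; `camera`, `classifier`, and `on_gesture` stand in for
# the Kinect V2 grabber, the trained gesture model, and the handler that
# triggers the collaborative safety modes.
# threading.Thread(target=acquire, args=(camera,), daemon=True).start()
# threading.Thread(target=detect, args=(classifier, on_gesture), daemon=True).start()
```

Decoupling the two loops this way lets acquisition run at the camera's frame rate while detection runs at the network's inference rate, which is the usual way to keep end-to-end latency bounded in a real-time pipeline.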