Abstract: We study the sign language recognition problem, which is to translate the meaning of signs from visual input such as videos. It is well known that many problems in the field of computer vision require large amounts of data to train deep neural network models. We introduce the KETI sign language dataset, which consists of 10,480 videos of high resolution and quality. Since different sign languages are used in different countries, the KETI sign language dataset can serve as a starting point for further research on Korean sign language recognition. Using this dataset, we develop a sign language recognition system based on human keypoints extracted from the face, hands, and body. The extracted keypoint vector is standardized by the mean and standard deviation of the keypoints and used as input to a recurrent neural network (RNN). We show that our sign language recognition system is robust even when the amount of training data is limited. Our system achieves 89.5% classification accuracy on 100 sentences that can be used in emergency situations.
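As a rough illustration of the keypoint standardization step described above, the sketch below standardizes a flat keypoint vector by its own mean and standard deviation before it is fed to the RNN. This is a minimal sketch under assumptions: the function name, the flat-vector layout, and the epsilon guard are illustrative choices, not details taken from the paper.

```python
import numpy as np

def standardize_keypoints(keypoints: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize a keypoint vector by its mean and standard deviation.

    keypoints: flat array of coordinates extracted from the face, hands,
    and body (e.g., by an off-the-shelf pose estimator). The exact
    grouping (per frame vs. per coordinate axis) is an assumption here.
    """
    mean = keypoints.mean()
    std = keypoints.std()
    # Epsilon guards against division by zero when all keypoints coincide.
    return (keypoints - mean) / (std + eps)

# Hypothetical usage: one frame's keypoints, shape (num_keypoints * 2,)
frame = np.random.rand(274)          # e.g., 137 (x, y) pairs, illustrative only
normalized = standardize_keypoints(frame)
```

One sequence of such standardized vectors, one per video frame, would then form the input sequence to the RNN.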