Gesture MNIST: A New Free-Hand Gesture Dataset

Published: 01 Jan 2022, Last Modified: 10 Nov 2023. Venue: ICANN (4) 2022
Abstract: We present a unimodal, comprehensive, and easy-to-use dataset for visual free-hand gesture recognition. We call it GestureMNIST because of the 28 × 28 grayscale format of its images, and because the number of samples, approximately 80,000, is similar to MNIST. Each sample from the six gesture classes is a sequence of 12 images taken by a 3D camera. As a peculiarity w.r.t. other datasets, all sequences are recorded by a single person, ensuring high sample uniformity and quality. A particular focus is to provide a vision-based dataset that can be used "out of the box" for sequence classification, without any preprocessing, segmentation, or feature extraction steps. We present classification experiments on GestureMNIST with different types of DNNs, establishing a performance baseline for sequence classification algorithms. We place particular emphasis on ahead-of-time classification, i.e., the correct identification of a gesture's class before the gesture is completed. It is shown that CNN- and LSTM-based deep learning achieves near-perfect performance, whereas ahead-of-time classification performance offers ample scope for future research with GestureMNIST. GestureMNIST contains visual samples only, but other modalities, namely acceleration and sound data, are available upon request.
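A minimal sketch of what sequence classification on such data looks like, assuming the layout described in the abstract (sequences of 12 grayscale 28 × 28 frames, six classes). The dataset's actual loading API is not specified here, so synthetic data with class-dependent mean intensity stands in for GestureMNIST, and a trivial nearest-centroid baseline stands in for the CNN/LSTM models used in the paper:

```python
import numpy as np

# Assumed sample layout per the abstract: (sequence of 12 frames, 28x28, grayscale).
NUM_CLASSES, SEQ_LEN, H, W = 6, 12, 28, 28
rng = np.random.default_rng(0)

def make_synthetic(n_per_class):
    """Synthetic stand-in for GestureMNIST: each class gets a distinct
    mean pixel intensity so even a trivial baseline separates them."""
    X, y = [], []
    for c in range(NUM_CLASSES):
        X.append(rng.normal(loc=c * 30.0, scale=5.0,
                            size=(n_per_class, SEQ_LEN, H, W)))
        y.append(np.full(n_per_class, c))
    return np.concatenate(X), np.concatenate(y)

X_train, y_train = make_synthetic(20)
X_test, y_test = make_synthetic(5)

# Nearest-centroid baseline: flatten each 12x28x28 sequence to one vector,
# classify by Euclidean distance to the per-class mean vector.
def flatten(X):
    return X.reshape(len(X), -1)

centroids = np.stack([flatten(X_train)[y_train == c].mean(axis=0)
                      for c in range(NUM_CLASSES)])
dists = np.linalg.norm(flatten(X_test)[:, None, :] - centroids[None], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == y_test).mean()
print(X_train.shape, accuracy)
```

On real GestureMNIST, the abstract reports that CNN- and LSTM-based models (rather than this toy baseline) reach near-perfect accuracy; the sketch only illustrates the tensor shapes a sequence classifier would consume.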