The NGT200 Dataset - Geometric Multi-View Isolated Sign Recognition

Published: 17 Jun 2024, Last Modified: 12 Jul 2024
Venue: ICML 2024 Workshop GRaM
License: CC BY 4.0
Track: Proceedings
Keywords: Geometric Deep Learning, Sign Language Recognition, Isolated Sign Recognition, Dataset, NGT200, Multi-view
TL;DR: Addressing multi-view isolated sign language recognition with geometrically grounded models.
Abstract: Sign Language Processing (SLP) provides the foundation for a more inclusive future in language technology; however, the field must first overcome significant challenges. This work addresses multi-view isolated sign recognition (MV-ISR) and emphasizes the importance of 3D awareness for real-world SLP applications. We introduce the NGT200 dataset as a new benchmark for MV-ISR and define MV-ISR as a task distinct from single-view ISR. We show the benefits of including synthetic data and propose conditioning sign representations on the spatial symmetries inherent to the visual modality of sign language. A geometrically grounded model improves MV-ISR performance by 8%-22% over the SL-GCN baseline.
Submission Number: 58