Abstract: Technology that supports human communication in sign language addresses a growing social need and is of engineering interest, as it involves multimodal information processing with potential applications in robotics. Recent advances in deep learning and generative AI have markedly improved the performance of image and natural language processing. This paper surveys sign language recognition, translation, and generation, dedicating a section to research using large language models, which represents a novel approach to sign language processing (SLP). We also review currently available datasets, focusing on their applicability to SLP. Key findings include the limitations of gloss-based approaches in capturing non-verbal cues and the wide variability across datasets, both of which impede the development of robust SLP systems. Additionally, we identify inconsistencies in evaluation metrics, emphasizing the need for standardized approaches that account for the nuances of both sign and spoken languages. Finally, we assess the relevance of existing datasets and their potential to advance SLP research.