Pose-Based Isolated Sign Language Recognition with Semantic Mapping and CorrFormer

Published: 2025, Last Modified: 28 Dec 2025, CSCWD 2025, CC BY-SA 4.0
Abstract: Pose-based isolated sign language recognition (ISLR) is resilient to background noise and computationally efficient. Existing methods typically use raw pose data, which are sensitive to camera angle and position, and this sensitivity reduces recognition accuracy. They also often fail to track the inter-frame trajectories that are essential for accurate sign interpretation. To address these limitations, we propose an end-to-end ISLR framework that combines sign language semantic mapping with CorrFormer: the semantic mapping leverages the intra-frame spatial invariance of human pose to improve stability, while CorrFormer tracks trajectories between temporally adjacent frames for more precise recognition. We evaluated our framework on WLASL and AUTSL; the results and ablation studies confirm its efficacy, with Top-1 accuracies of 67.72%, 47.51%, and 85.74% on WLASL100, WLASL300, and AUTSL, respectively.
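The abstract does not say how the semantic mapping or CorrFormer is computed, so the following is not the authors' method. It is only a minimal sketch of the two general ideas the abstract names: making per-frame keypoints insensitive to camera position and distance, and describing motion between adjacent frames. The function names, the 2D keypoint layout, and the choice of root and reference joints are all assumptions made for illustration. The sketch handles translation and scale only, not changes in camera angle.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # one (x, y) pair per joint; the joint order is an assumption


def normalize_frame(frame: Frame, root: int = 0, ref_a: int = 1, ref_b: int = 2) -> Frame:
    """Center joints on a root joint and divide by a reference bone length.

    Hypothetical normalization: `root`, `ref_a`, and `ref_b` are illustrative
    joint indices (for example neck and the two shoulders), not values taken
    from the paper.
    """
    rx, ry = frame[root]
    (ax, ay), (bx, by) = frame[ref_a], frame[ref_b]
    # Fall back to 1.0 if the reference joints coincide, to avoid dividing by zero.
    scale = hypot(ax - bx, ay - by) or 1.0
    return [((x - rx) / scale, (y - ry) / scale) for x, y in frame]


def adjacent_displacements(frames: List[Frame]) -> List[Frame]:
    """Per-joint displacement between each pair of consecutive frames."""
    return [
        [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(prev, curr)]
        for prev, curr in zip(frames, frames[1:])
    ]


# Toy pose with four joints, and the same pose seen from twice as close
# and shifted in the image plane.
pose = [(0.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
shifted = [(2 * x + 10.0, 2 * y + 5.0) for x, y in pose]

norm_pose = normalize_frame(pose)
norm_shifted = normalize_frame(shifted)

# A second frame in which only the last joint moves to the right.
next_pose = [(0.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
motion = adjacent_displacements([norm_pose, normalize_frame(next_pose)])
```

In this toy case `norm_pose` and `norm_shifted` are identical, which is the kind of stability that raw coordinates lack, and `motion` contains a single frame-to-frame step in which only the last joint has a nonzero displacement.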