CESLR: A Multi-Signer Benchmark and SpatioTemporal End-to-End Framework for Continuous Ethiopian Sign Language Recognition
Keywords: Ethiopian Sign Language; Sign Language Recognition; Deep Learning; Continuous SLR; EthSL
Abstract: Continuous Ethiopian Sign Language Recognition (CESLR) remains an underexplored challenge due to the language’s rich visual dynamics and variability among signers. This work introduces CESLR, the first large-scale, multi-signer video corpus for continuous EthSL recognition, consisting of 1,320 sentence recordings performed twice by 22 participants. To process these visual sequences, we design an end-to-end deep learning architecture that jointly models spatial and temporal cues through 2D convolutional feature extraction, 1D temporal convolution, and bidirectional recurrent encoding. A Connectionist Temporal Classification (CTC) layer enables sequence alignment without frame-level annotation, allowing the network to learn directly from sentence-level supervision. Experimental evaluation shows that the system attains a Word Error Rate (WER) of 8.82% in signer-independent testing and 47.02% on unseen-sentence evaluation, revealing strong cross-signer generalization while highlighting the difficulty of novel sentence recognition. The proposed dataset and model establish a foundational benchmark for advancing automatic EthSL translation and assistive communication technologies in multilingual settings.
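A CTC layer trained on sentence-level labels is typically evaluated by collapsing its per-frame outputs (removing repeated labels and blanks) and scoring the result with Word Error Rate, the metric reported above. The following is an illustrative sketch of those two standard steps, not the paper's implementation; the gloss IDs and words used are hypothetical.

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks."""
    out, prev = [], None
    for label in frame_ids:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

def wer(ref, hyp):
    """Word Error Rate = word-level edit distance / reference length."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical example: per-frame argmax IDs with blank=0.
print(ctc_greedy_decode([0, 3, 3, 0, 5, 5, 5, 0]))          # → [3, 5]
print(wer(["I", "LEARN", "SIGN"], ["I", "SIGN"]))           # → 0.333…
```

A WER of 8.82% thus means roughly one word-level error per eleven reference glosses, which makes the gap to the 47.02% unseen-sentence setting easy to interpret.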
Submission Number: 42