Abstract: Gloss is a written approximation that bridges a Sign Language (SL) and its corresponding spoken language. Despite a deaf and hard-of-hearing population of 1.7 million, Bangla Sign Language (BdSL) remains largely understudied, with no prior work on Bangla text-to-gloss translation and no publicly accessible datasets. To address this gap, we construct a dataset of Bangla sentences and their gloss representations, adapting rule-based glossing methods from German and American Sign Languages to fit BdSL. We further augment the dataset using GPT-4o, alongside back-translation and text-generation techniques. We fine-tune pretrained mBART-large-50 (hereafter, mBART) and mBERT-multilingual-uncased models, and train traditional baselines including RNN, GRU, and a novel seq-to-seq model with multi-head attention. Fine-tuning mBART achieves the best performance (sacreBLEU = 79.53). We hypothesize that mBART's pretraining on shuffled and masked text aligns well with the inherently non-linear structure of gloss. Testing this on the PHOENIX-14T benchmark confirms our hypothesis: mBART achieves state-of-the-art results across six metrics, including sacreBLEU = 63.89 and COMET = 0.624. Our work introduces the first Bangla text-to-gloss framework and highlights the effectiveness of rule-based synthetic data in tackling low-resource sign language translation.
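To make the rule-based glossing idea concrete, here is a minimal, hypothetical sketch loosely modelled on common ASL/DGS glossing conventions (drop function words, keep content words, render them in uppercase). The stop-word list, function name, and example sentence are illustrative assumptions, not the paper's actual BdSL rules:

```python
# Hypothetical rule-based text-to-gloss step (illustrative only, NOT the
# paper's actual BdSL rule set): remove function words and uppercase the
# remaining content words to approximate a gloss string.

FUNCTION_WORDS = {"is", "are", "the", "a", "an", "to", "of"}  # assumed stop list

def text_to_gloss(sentence: str) -> str:
    """Convert a sentence to a naive gloss string."""
    tokens = sentence.lower().strip(".?!").split()
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return " ".join(t.upper() for t in content)

print(text_to_gloss("The boy is going to school."))  # BOY GOING SCHOOL
```

Real glossing pipelines additionally handle lemmatization, word reordering, and non-manual markers; this sketch only illustrates the general shape of such rules.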
Paper Type: Long
Research Area: Phonology, Morphology and Word Segmentation
Research Area Keywords: Morphological Analysis, Morphological Segmentation, Fine-tuning
Contribution Types: Approaches to low-resource settings
Languages Studied: Bengali, English
Submission Number: 1428