ReSLT: Retrieval-enhanced Sign Language Translation with LLMs

ACL ARR 2025 May Submission7165 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Gloss-free Sign Language Translation (SLT) aims to directly translate visual expressions into spoken language, bypassing intermediate gloss annotations. Recent studies have demonstrated remarkable performance by leveraging Large Language Models (LLMs) in gloss-free SLT. However, existing approaches often fail to fully exploit the potential of LLMs due to simplistic prompt design. To address this gap, we propose ReSLT, a Retrieval-Augmented Generation SLT framework that utilizes pre-existing linguistic knowledge to enable LLMs to effectively comprehend sign languages. ReSLT incorporates a semantic prompting strategy, aligning video and text embeddings to construct context-aware prompts. Additionally, the proposed framework maintains a lightweight structure, allowing for easy integration with other SLT models, thus enhancing the applicability of LLMs in SLT. Our experiments demonstrate that even with the simplest architecture, ReSLT achieves performance gains in Korean Sign Language and German Sign Language, highlighting its effectiveness and scalability.
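To illustrate the retrieval-augmented prompting idea described in the abstract, here is a minimal sketch. It is not the authors' implementation: the toy embeddings, sentence corpus, and prompt template are all hypothetical. It assumes a video embedding has already been aligned into the same space as text embeddings; the nearest reference sentences are then retrieved by cosine similarity and inserted into the LLM prompt as context.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_emb, corpus, k=2):
    # corpus: list of (sentence, embedding) pairs.
    # Return the k sentences whose embeddings are closest to the query.
    ranked = sorted(corpus, key=lambda item: cosine(query_emb, item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]

def build_prompt(retrieved):
    # Hypothetical context-aware prompt template.
    context = "\n".join(f"- {s}" for s in retrieved)
    return (
        "Translate the sign language video into spoken language.\n"
        f"Related reference sentences:\n{context}\n"
        "Translation:"
    )

# Toy 3-dimensional embeddings standing in for real aligned features.
corpus = [
    ("The weather is nice today.", [0.9, 0.1, 0.0]),
    ("Where is the train station?", [0.1, 0.9, 0.0]),
    ("It will rain tomorrow.", [0.8, 0.2, 0.1]),
]
video_emb = [0.85, 0.15, 0.05]  # hypothetical video embedding in the shared space

prompt = build_prompt(retrieve(video_emb, corpus))
print(prompt)
```

In this sketch the two weather-related sentences are retrieved, so the prompt carries topical context the LLM can condition on. A real system would replace the toy vectors with learned video/text encoders and a much larger retrieval corpus.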
Paper Type: Short
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: LLMs, Multimodal, RAG, Sign Language Translation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English, German, Korean
Submission Number: 7165