GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning

ACL ARR 2024 June Submission 5162 Authors

16 Jun 2024 (modified: 02 Jul 2024) · CC BY 4.0
Abstract: We introduce GrammaMT, a grammatically aware prompting approach for machine translation that uses Interlinear Glossed Text (IGT), a common form of linguistic annotation describing the lexical and functional morphemes of source sentences. GrammaMT proposes two prompting strategies: gloss-shot and chain-gloss. Both are training-free and require only a few examples, which take minimal effort to collect, making them well-suited for low-resource setups. Experiments and ablation studies on open-source instruction-tuned LLMs across three benchmarks demonstrate the benefits of leveraging interlinear gloss resources for machine translation. GrammaMT improves translation performance from various low-resource languages into high-resource ones on the largest existing corpus of IGT data, on the challenging 2023 SIGMORPHON Shared Task data covering rarely seen, endangered languages, and even in an out-of-domain setting within FLORES.
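To make the gloss-shot idea concrete, below is a minimal sketch of how a few-shot prompt with IGT-annotated examples might be assembled. This is illustrative only: the function name `build_gloss_shot_prompt`, the `IGTExample` container, the prompt wording, and the toy Swahili example are assumptions, not the paper's actual template.

```python
# Illustrative sketch of gloss-shot prompt construction (not the paper's
# exact prompt format): each in-context example pairs a source sentence
# with its interlinear gloss and reference translation.

from dataclasses import dataclass


@dataclass
class IGTExample:
    source: str       # sentence in the low-resource source language
    gloss: str        # interlinear gloss (lexical/functional morphemes)
    translation: str  # reference translation in the target language


def build_gloss_shot_prompt(examples, query_source, query_gloss,
                            target_lang="English"):
    """Assemble a few-shot prompt whose examples carry IGT glosses."""
    parts = []
    for ex in examples:
        parts.append(
            f"Sentence: {ex.source}\n"
            f"Gloss: {ex.gloss}\n"
            f"Translation ({target_lang}): {ex.translation}\n"
        )
    # The query keeps the same layout but leaves the translation open
    # for the LLM to complete.
    parts.append(
        f"Sentence: {query_source}\n"
        f"Gloss: {query_gloss}\n"
        f"Translation ({target_lang}):"
    )
    return "\n".join(parts)


# Toy usage with one Swahili example (glossing conventions vary):
shots = [
    IGTExample(
        source="ni-na-soma kitabu",
        gloss="1SG-PRS-read book",
        translation="I am reading a book",
    )
]
print(build_gloss_shot_prompt(shots, "ni-na-andika barua",
                              "1SG-PRS-write letter"))
```

The chain-gloss variant, as described in the abstract, would instead prompt the model to produce the gloss as an intermediate step before translating, rather than supplying it in the query.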
Paper Type: Long
Research Area: Machine Translation
Research Area Keywords: Machine Translation, Large Language Models, In-context Learning, Prompting
Contribution Types: Approaches to low-resource settings
Languages Studied: Gitksan, Lezgi, Natugu, Tsez, Swahili, Yoruba, Icelandic, Marathi, Kannada, Urdu, Thai, Greek, Portuguese, Japanese, Russian, Arabic, English
Submission Number: 5162