Abstract: Label projection, which jointly produces translated labels and translated texts, is essential for leveraging machine translation to enable cross-lingual transfer in structured prediction tasks. Prior work on label projection either compromises translation quality to simplify label identification or relies solely on word alignment to construct label spans, which introduces inaccuracies. In this paper, we introduce a novel label projection approach, CLAP, which first translates the text into the target language and then performs contextual translation of the labels, using the translated text as context to ensure more accurate translated labels. We use instruction-tuned language models with multilingual capabilities as the contextual translator, instructing them to produce label translations that appear in the translated text. We compare CLAP with other label projection techniques on zero-shot cross-lingual transfer across 39 languages on two representative structured prediction tasks: event argument extraction (EAE) and named entity recognition (NER). Experiments show that CLAP improves over the other label projection techniques by 1.7 F1 points on EAE and 1.4 F1 points on NER.
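The abstract describes a two-stage pipeline: translate the sentence, then translate each label span with the translated sentence as context, instructing the model that the translated span must appear in the translated sentence. The sketch below illustrates that flow under loose assumptions; the prompts, the `call_llm` helper, and the example data are hypothetical placeholders, not the paper's exact implementation.

```python
# Sketch of a two-stage contextual label projection pipeline in the spirit of CLAP.
# `call_llm` is a placeholder for any instruction-tuned multilingual LLM backend.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an instruction-tuned LLM and return its text output."""
    raise NotImplementedError("plug in an LLM API or local model here")

def translate_text(source_text: str, target_lang: str) -> str:
    """Stage 1: translate the full sentence into the target language."""
    prompt = f"Translate the following sentence into {target_lang}:\n{source_text}"
    return call_llm(prompt)

def project_labels(source_text: str, translated_text: str,
                   spans: list[str], target_lang: str) -> list[str]:
    """Stage 2: translate each label span with the translated sentence as context,
    instructing the model that the output must appear in that sentence."""
    projected = []
    for span in spans:
        prompt = (
            f"Source sentence: {source_text}\n"
            f"{target_lang} translation: {translated_text}\n"
            f'Translate the span "{span}" into {target_lang} so that it appears '
            f"verbatim in the translation above. Answer with the span only."
        )
        projected.append(call_llm(prompt).strip())
    return projected

if __name__ == "__main__":
    src = "The earthquake destroyed three bridges in the city."
    spans = ["earthquake", "three bridges"]  # e.g., argument or entity spans
    translated = translate_text(src, "Chinese")
    print(project_labels(src, translated, spans, "Chinese"))
```

In this sketch the constraint is expressed only through the instruction; a real system could additionally verify that each projected span is a substring of the translated sentence and retry otherwise.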
Paper Type: long
Research Area: Information Extraction
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: Afrikaans, Arabic, Bulgarian, Bengali, German, Greek, English, Spanish, Estonian, Basque, Farsi, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Javanese, Georgian, Kazakh, Korean, Malayalam, Marathi, Malay, Burmese, Dutch, Portuguese, Russian, Swahili, Tamil, Telugu, Thai, Tagalog, Turkish, Urdu, Vietnamese, Yoruba, Chinese
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.