Zero and Few-Shot Learning Techniques for Cross-lingual Classification Tasks on Arabic and Code-Switched Data

ACL ARR 2024 June Submission 1029 Authors

13 Jun 2024 (modified: 03 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Zero-shot and few-shot learning techniques offer promising solutions for addressing data scarcity in Natural Language Processing (NLP), particularly for under-resourced languages such as Arabic and for code-switching scenarios. Traditional supervised deep learning methods often struggle in such contexts because they depend on extensive labeled data. In this paper, we propose an approach that applies zero-shot and few-shot learning methodologies to cross-lingual classification tasks, focusing on Named Entity Recognition (NER) in Arabic text and sentiment analysis in both Arabic and code-switched Arabic-English data. We introduce two approaches, based on Pattern-Exploiting Training (PET) and Better Few-shot Fine-tuning of language models (LM-BFF), both of which generalize across diverse classification tasks. We then conduct comprehensive evaluations on the NER and sentiment analysis tasks, showing that LM-BFF performs best, surpassing previous techniques by 1.5% F1 score on sentiment analysis of code-switched data. This study underscores the importance of zero- and few-shot learning methodologies for overcoming data-scarcity challenges in Arabic NLP and code-switching research, thereby advancing NLP capabilities in under-resourced linguistic contexts.
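For readers unfamiliar with the prompt-based methods named in the abstract, the sketch below illustrates the core PET idea: a classification example is rewritten as a cloze pattern and scored by a masked language model through a verbalizer that maps label words to classes. This is a minimal zero-shot illustration, not the authors' configuration; the model name, the English pattern, and the verbalizer words are illustrative assumptions (the paper targets Arabic and Arabic-English code-switched text, and LM-BFF additionally fine-tunes on a few labeled examples).

```python
# Minimal PET-style zero-shot sentiment sketch (illustrative assumptions only).
# A cloze pattern turns classification into masked-token prediction, and a
# verbalizer maps single label tokens to class names.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumption: any masked LM with coverage of the target text can be used here.
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# Verbalizer: one single-token label word per class (PET assumes single tokens
# unless multi-token verbalizers are handled explicitly).
verbalizer = {"positive": "good", "negative": "bad"}

def classify(text: str) -> str:
    # Pattern: rewrite the input as a cloze question with one [MASK] slot.
    prompt = f"{text} Overall it was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    # Locate the mask position and read out its logits.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each class by the logit of its verbalizer token.
    scores = {
        label: logits[tokenizer.convert_tokens_to_ids(word)].item()
        for label, word in verbalizer.items()
    }
    return max(scores, key=scores.get)

# Hypothetical code-switched input, for illustration only.
print(classify("el film kan very nice w el acting kaman"))
```

In the few-shot setting, PET fine-tunes the masked LM on labeled examples rendered through the same pattern, while LM-BFF additionally searches for effective prompts and prepends demonstration examples to the input.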
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Zero-shot Learning, Few-shot Learning, Code-Switching, Natural Language Processing
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: Arabic, Code-Switching Arabic-English
Submission Number: 1029