Keywords: BERT, Knowledge Distillation, DistilBERT, Arabic DistilBERT, Arabic Language models, Language models, NLP
TL;DR: Arabic DistilBERT Language Model
Abstract: The lack of strong Arabic language models has held back Arabic NLP tasks, which lag in robustness and accuracy. While a version of BERT pre-trained on Arabic is available, a smaller distilled version could be considerably easier to deploy at scale. In this paper, we propose developing DistilBERT, a smaller and more efficient version of BERT, for the Arabic language, with the goal of achieving comparable results using significantly fewer computational resources. Using knowledge distillation to create a compact model allows for wider deployment, even in settings with limited computational resources. Ultimately, this project aims to break down language barriers, foster greater inclusivity, and improve the accessibility of Arabic in NLP applications worldwide. It serves as a starting point for further research into the performance of the Arabic DistilBERT model across various NLP tasks.
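
As a rough illustration of the knowledge distillation idea mentioned in the abstract, the sketch below computes a soft-target loss: the KL divergence between temperature-softened teacher and student output distributions. It is a minimal example under stated assumptions, not the training objective of this work. The function names, the temperature value, and the toy logits are all hypothetical, and a full DistilBERT-style setup would typically combine a term like this with other losses.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature (an illustrative default, not a value from the paper)
    flattens both distributions so the student also learns from the
    teacher's low-probability classes. The T^2 factor is the usual
    rescaling that keeps gradient magnitudes comparable across temperatures.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    kl = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )
    return kl * temperature ** 2


# Toy logits over a three-token vocabulary (hypothetical values).
teacher = [1.0, 2.0, 3.0]
student = [3.0, 2.0, 1.0]
loss_mismatch = distillation_loss(student, teacher)
loss_match = distillation_loss(teacher, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what pushes the smaller model toward the larger model's behavior.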