Knowledge Distillation of BERT Language Model on the Arabic Language

Hager Adil; Abrar Elidrisi; Tahani Attia; Muhammed Saeed

01 Mar 2023 (modified: 14 Nov 2023) | Submitted to Tiny Papers @ ICLR 2023 | Readers: Everyone

Keywords: BERT, Knowledge Distillation, DistilBERT, Arabic DistilBERT, Arabic Language models, Language models, NLP

TL;DR: Arabic DistilBERT Language Model

Abstract: The lack of strong Arabic language models has held back Arabic NLP tasks, which lag behind in robustness and accuracy. While a version of BERT pre-trained on Arabic is available, a smaller distilled version could be considerably more scalable. In this paper, we propose developing DistilBERT for the Arabic language, a smaller and more efficient version of BERT, with the aim of achieving comparable results using significantly fewer computational resources. Employing knowledge distillation to create a compact model allows for wider deployment, even in settings with limited computational resources. Ultimately, this project aims to break down language barriers, promote inclusivity, and improve the accessibility of the Arabic language in NLP applications worldwide. It serves as a starting point for further research into the performance of the Arabic DistilBERT model across various NLP tasks.
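The abstract does not specify the training objective, so the following is only a minimal sketch of the soft-target loss commonly used in knowledge distillation, not the authors' implementation. It assumes the student is trained to match the teacher's temperature-softened output distribution; the function names `softmax` and `distillation_loss` and the default temperature are illustrative choices, and the full DistilBERT recipe combines this term with additional losses that are omitted here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-target loss; the source does not
    state which loss or temperature was actually used.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss = distillation_loss(student, teacher)
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to incorrect tokens. The loss is zero when the student's logits reproduce the teacher's distribution exactly.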
