Abstract: Neural headline generation (NHG) has achieved impressive performance on the abstractive headline generation task in recent years. However, few studies have sought to improve NHG for agglutinative, morphologically rich languages, partly because resources for these languages are scarce. This work presents a dataset for the Arabic abstractive headline generation task. We use the dataset in experiments that compare word segmentation methods, such as morphological word segmentation and Subword Regularization (SR). Experimental results show that morphological word segmentation and SR outperform or are on par with the state-of-the-art Byte Pair Encoding (BPE), both quantitatively and qualitatively.