Keywords: NLP, Morphological analysis, Word segmentation, Transliteration, Endangered languages.
Abstract: Torwali [ISO 639-3: trw] is an endangered indigenous language spoken in northern Pakistan. It is a low-resource language written in a
right-to-left (RTL) Perso-Arabic script. This paper discusses the challenges of, and approaches to, processing Torwali with various NLP techniques in order to develop tools and resources. The work contributes to morphological analysis, word segmentation, POS tagging, and transliteration
of Torwali. It can serve as a resource for other lexically similar endangered languages of northern Pakistan, and it will help improve the digital presence of the Torwali language and safeguard it against endangerment.
Submission Number: 9