Abstract: Question classification plays an important role in Question Answering systems by determining the type of answer a question expects. A question classification system identifies the question type, which then guides the selection of the correct answer. This paper presents the first annotated corpus of questions in the Albanian language, using a six-class tag set. The annotated corpus is used to train and test three machine learning algorithms on the question classification task, and the models are evaluated with a variety of metrics. We conclude by analyzing and comparing the performance of the implemented models and giving directions for future improvement. In terms of accuracy, the best-performing algorithm is SVM.