A Novel Perspective of Text Classification by Prolog-Based Deductive Databases

Kiet Van Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen

2021 (modified: 30 Oct 2022) | IEA/AIE (2) 2021 | Readers: Everyone

Abstract: Natural language processing has been studied extensively worldwide and applied in many areas, including text classification. In particular, the rapid growth of social networking platforms has produced a considerable increase in data, making it a fertile domain for studies on text classification. This task has been studied in many languages, but work on Vietnamese remains limited. We therefore aim to classify Vietnamese texts from two Vietnamese benchmark datasets. Although many studies apply machine learning models to this task, none has used facts and rules in a deductive database to classify Vietnamese text. We design a system architecture based on facts and rules in a deductive database for Vietnamese text classification. Our experiments show positive results on both datasets. The best performances reach an F1-score of 93.18% on the UIT-ViNames dataset, and 76.79% and 69.96% for sentiment detection and topic classification on the UIT-VSFC dataset, respectively. Although these results do not surpass those of previous studies, they are a premise for developing solutions to natural language processing problems on deductive databases and a successful pilot of text classification on a Prolog-based deductive database.
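
To make the idea of classifying text with facts and rules concrete, the following is a minimal sketch in Python, not the authors' system. It assumes a hypothetical set of keyword facts (analogous to Prolog clauses such as `keyword(good, positive).`) and a single hypothetical voting rule; the words, labels, and the `classify` function are illustrative assumptions, and the abstract does not specify the actual facts, rules, or tokenization used.

```python
from collections import Counter

# Hypothetical facts, analogous to Prolog clauses like keyword(good, positive).
# These words and labels are illustrative only, not taken from the paper.
KEYWORD_FACTS = {
    ("good", "positive"),
    ("helpful", "positive"),
    ("boring", "negative"),
    ("slow", "negative"),
}


def classify(text, default="neutral"):
    """Hypothetical rule: each token T with a fact keyword(T, L) votes for
    label L; the label with the most votes is returned."""
    tokens = text.lower().split()
    votes = Counter(
        label
        for token in tokens
        for (word, label) in KEYWORD_FACTS
        if word == token
    )
    if not votes:
        # No fact matches any token, so fall back to the default label.
        return default
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(votes), key=lambda label: votes[label])
```

For example, `classify("the lecturer is helpful and good")` matches two facts with the label `positive`, while a text that matches no fact falls back to the default label. In a deductive database the same logic would be expressed declaratively as facts plus a rule that is queried, rather than as an imperative function.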
