GenDesc: A Partial Generalization of Linguistic Features for Text Classification

Published: 2013, Last Modified: 04 Oct 2025, Venue: NLDB 2013, License: CC BY-SA 4.0
Abstract: This paper presents an application of automatic classification of textual data with supervised learning algorithms. The aim is to study how a better representation of textual data can improve classification quality. Since a word's meaning depends on its context, we propose features that carry important information about word contexts. We present a method named GenDesc, which generalizes the words that are least relevant to the classification task by replacing them with their POS tags.
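The partial generalization idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `generalize`, the relevance scores, the threshold, and the tag set are all hypothetical, and the abstract does not say how relevance is measured. The sketch only assumes that each token arrives with a POS tag and that some per-word relevance score is available.

```python
from typing import Dict, List, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag), produced by any POS tagger


def generalize(
    tagged: List[TaggedToken],
    relevance: Dict[str, float],
    threshold: float,
) -> List[str]:
    """Keep words judged relevant; replace the others with their POS tag.

    `relevance` and `threshold` are illustrative assumptions: the abstract
    does not specify the ranking measure or the cut-off.
    """
    features = []
    for word, tag in tagged:
        key = word.lower()
        # Words missing from the relevance table are treated as irrelevant.
        score = relevance.get(key, 0.0)
        features.append(key if score >= threshold else tag)
    return features


# Hypothetical pre-tagged sentence and hand-picked relevance scores.
sentence = [
    ("The", "DET"),
    ("movie", "NOUN"),
    ("was", "VERB"),
    ("absolutely", "ADV"),
    ("terrible", "ADJ"),
]
scores = {"terrible": 0.9, "absolutely": 0.6, "movie": 0.2}

features = generalize(sentence, scores, threshold=0.5)
print(features)
```

The resulting mixed sequence of words and tags could then be passed to any bag-of-words or n-gram feature extractor before training a supervised classifier.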