Abstract: Feature selection is an efficient strategy for reducing the dimensionality of data and removing noise in text categorization. However, most feature selection methods remove non-informative features based on corpus statistics, which are not directly related to classification accuracy. In this paper, we propose an effective feature selection method that directly targets the classification accuracy of the k-nearest neighbors (KNN) classifier. Our experiments show that our method outperforms the traditional methods and that it also benefits other classifiers, such as Support Vector Machines (SVM).