Applying an existing machine learning algorithm to text categorizationOpen Website

1995 (modified: 10 Nov 2022)Learning for Natural Language Processing 1995Readers: Everyone
Abstract: The information retrieval community is becoming increasingly interested in machine learning techniques, of which text categorization is an application. This paper describes how we have applied an existing similarity-based learning algorithm, Charade, to the text categorization problem and compares the results with those obtained using decision tree construction algorithms. From a machine learning point of view, this study was motivated by the size of the inspected data in such applications. Using the same representation of documents, Charade offers better performance than earlier reported experiments with decision trees on the same corpus. In addition, the way in which learning with redundancy influences categorization performance is also studied.
0 Replies

Loading