Accurate stemming of Dutch for text classification

08 Jun 2023 | OpenReview Archive Direct Upload | Readers: Everyone
Abstract: This paper investigates the use of stemming for the classification of Dutch (email) texts. We introduce a stemmer that combines dictionary lookup (implemented efficiently as a finite-state automaton) with a rule-based backup strategy, and show that it outperforms the Dutch Porter stemmer in accuracy without being substantially slower.
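
The lookup-then-fallback design described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy lexicon, the suffix list, and the names `LEXICON`, `SUFFIX_RULES`, `MIN_STEM_LENGTH`, `rule_based_stem` and `stem` are hypothetical and do not come from the paper. A plain Python dict stands in for the finite-state automaton, and the suffix rules are a simplified placeholder, not the Dutch Porter rules or the paper's backup strategy.

```python
# Hypothetical toy lexicon mapping word forms to stems. The paper's actual
# dictionary is not reproduced here; a dict replaces the finite-state
# automaton purely for readability.
LEXICON = {
    "fietsen": "fiets",
    "huizen": "huis",
    "liep": "loop",
}

# Hypothetical suffix rules, tried in order (longest first). These are NOT
# the Dutch Porter rules, only a placeholder for a rule-based backup.
SUFFIX_RULES = ("heden", "en", "e", "s")

# Assumed guard so that very short words are not stripped to nothing.
MIN_STEM_LENGTH = 3


def rule_based_stem(word: str) -> str:
    """Strip the first matching suffix that leaves a long enough stem."""
    for suffix in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def stem(word: str) -> str:
    """Look the word up in the lexicon; fall back to the rules if absent."""
    token = word.lower()
    if token in LEXICON:
        return LEXICON[token]
    return rule_based_stem(token)


if __name__ == "__main__":
    for example in ("Huizen", "liep", "boeken", "de"):
        print(example, "->", stem(example))
```

In this sketch, irregular forms such as "liep" are resolved only because they appear in the lexicon, while unseen words such as "boeken" are handled by the backup rules. That split is the motivation for combining the two strategies, though the relative accuracy and speed claims are the paper's and are not demonstrated by this example.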