On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora

1998 (modified: 16 Jul 2019), COLING-ACL 1998
Abstract: This paper addresses the evaluation of POS taggers. Such evaluation is usually performed by comparing the tagger output with a reference test corpus that is assumed to be error-free. In practice, currently used corpora contain noise, so the measured performance is a distorted estimate of the real value. We analyze to what extent this distortion may invalidate comparisons between taggers or measurements of the improvement achieved by a new system. The main conclusion is that a more rigorous experimental design for testing is needed to reliably evaluate and compare tagger accuracies.
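
To make the distortion concrete, the following is a minimal sketch of a worst-case bound, not the analysis carried out in the paper. It assumes every token has exactly one correct tag, that a known fraction of reference tags is wrong, and that nothing is known about where those errors fall. The function names and the numbers in the usage note are hypothetical.

```python
from typing import Tuple


def true_accuracy_bounds(observed_accuracy: float,
                         reference_error_rate: float) -> Tuple[float, float]:
    """Worst-case interval for the real accuracy of a tagger.

    observed_accuracy: fraction of tokens where the tagger agrees with the
        reference corpus.
    reference_error_rate: assumed fraction of tokens whose reference tag is
        wrong (an assumption; in practice this must be estimated).
    """
    if not 0.0 <= observed_accuracy <= 1.0:
        raise ValueError("observed_accuracy must be in [0, 1]")
    if not 0.0 <= reference_error_rate <= 1.0:
        raise ValueError("reference_error_rate must be in [0, 1]")

    k, e = observed_accuracy, reference_error_rate
    # Worst case: every noisy token is one the tagger "agreed" with,
    # so each of those agreements is really a tagger error.
    lower = k - min(e, k)
    # Best case: every noisy token is one the tagger "disagreed" with,
    # and the tagger was actually right on all of them.
    upper = k + min(e, 1.0 - k)
    return lower, upper


def comparison_is_conclusive(observed_a: float, observed_b: float,
                             reference_error_rate: float) -> bool:
    """Conservative check: True only if the two worst-case intervals
    do not overlap, so the ranking cannot be an artifact of the noise."""
    lo_a, hi_a = true_accuracy_bounds(observed_a, reference_error_rate)
    lo_b, hi_b = true_accuracy_bounds(observed_b, reference_error_rate)
    return lo_a > hi_b or lo_b > hi_a
```

With hypothetical observed accuracies of 0.97 and 0.975 and an assumed 3% of wrong reference tags, the two intervals overlap almost entirely, so `comparison_is_conclusive(0.97, 0.975, 0.03)` is false: under these assumptions a half-point difference cannot be separated from the noise. The check is deliberately conservative, since it ignores the fact that both taggers are scored against the same noisy tokens.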