Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers
Abstract: Highlights
• The text preprocessing techniques available in the literature are discussed.
• The impact of the three most common techniques on state-of-the-art (SOTA) models is evaluated.
• Text preprocessing can significantly affect the performance of Transformers.
• With appropriate preprocessing, traditional classifiers can outperform Transformers.
• The choice of preprocessing should be based on the models and datasets considered.