Optimizing Training Data for Image Classifiers

Nikolaos Vasiloglou

07 May 2021 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: In this paper, we propose a robust outlier-removal method to improve image classification performance. Increasing the size of the training data does not necessarily raise prediction accuracy, because some instances may be poor representatives of their respective classes. We run four separate experiments to evaluate the effectiveness of outlier removal for several classifiers. Embeddings are generated from a pre-trained neural network, a fine-tuned network, and a Siamese network. Outlier detection is then evaluated based on clustering quality and on the performance of three classifiers: a fully connected feed-forward network, K-Nearest Neighbors, and a gradient boosting model.
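The pipeline outlined in the abstract (embed, drop outliers, then classify) can be illustrated with a minimal sketch. The abstract does not state which outlier criterion is used, so the per-class centroid-distance rule below is an assumption chosen for simplicity, not the paper's method. The function names, the `keep_fraction` parameter, and the toy 2-D vectors standing in for network embeddings are all hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(embeddings, labels):
    """Return the mean embedding of each class."""
    groups = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        groups[lab].append(vec)
    return {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }


def remove_outliers(embeddings, labels, keep_fraction=0.9):
    """Assumed rule: per class, keep the points closest to the class centroid.

    Returns the sorted indices of the retained training instances.
    """
    centroids = class_centroids(embeddings, labels)
    by_class = defaultdict(list)
    for idx, (vec, lab) in enumerate(zip(embeddings, labels)):
        by_class[lab].append((math.dist(vec, centroids[lab]), idx))
    kept = []
    for scored in by_class.values():
        scored.sort()  # nearest to the centroid first
        n_keep = max(1, math.ceil(keep_fraction * len(scored)))
        kept.extend(idx for _, idx in scored[:n_keep])
    return sorted(kept)


def knn_predict(train_x, train_y, query, k=3):
    """Plain K-Nearest Neighbors majority vote with Euclidean distance."""
    nearest = sorted(zip((math.dist(x, query) for x in train_x), train_y))[:k]
    votes = defaultdict(int)
    for _, lab in nearest:
        votes[lab] += 1
    return max(votes, key=votes.get)


# Toy stand-ins for embeddings; index 4 is labeled "cat" but sits in the "dog" cluster.
embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0], [5.0, 4.9],
]
labels = ["cat"] * 5 + ["dog"] * 4

kept = remove_outliers(embeddings, labels, keep_fraction=0.8)
clean_x = [embeddings[i] for i in kept]
clean_y = [labels[i] for i in kept]

prediction_near_dogs = knn_predict(clean_x, clean_y, [4.8, 4.8], k=3)
prediction_near_cats = knn_predict(clean_x, clean_y, [0.05, 0.05], k=3)
```

In this toy setup the filter discards the one "cat" point that lies among the "dog" points before the classifier is fit. A real experiment would replace the toy vectors with embeddings from the networks named in the abstract and compare classifier accuracy with and without the filtering step.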
