Perturbation-based oversampling technique for imbalanced classification problems

Published: 01 Jan 2023, Last Modified: 13 May 2023, Venue: Int. J. Mach. Learn. Cybern. 2023, Readers: Everyone
Abstract: We present a simple yet effective idea, perturbation-based oversampling (POS), to tackle imbalanced classification problems. In this method, we perturb each feature of a given minority instance to generate a new instance. The originality and advantage of POS lie in a hyperparameter p that controls the variance of the perturbation, which provides the flexibility to adapt the algorithm to data with different characteristics. Experiments with five types of classifiers and 11 performance metrics on 103 imbalanced datasets show that POS offers results comparable to or better than those of 11 reference methods across multiple performance metrics. An important finding of this work is that a simple perturbation-based oversampling method can yield better classification results than many advanced oversampling methods by controlling the variance of the input perturbation. This suggests that comparisons with simple oversampling methods, e.g., POS, may be needed when designing new oversampling approaches.
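The abstract does not give the exact perturbation rule, so the following is only a minimal sketch of the general idea under stated assumptions: each new instance is a randomly chosen minority instance plus zero-mean Gaussian noise, and p scales the noise standard deviation relative to each feature's standard deviation within the minority class. The function name `perturbation_oversample`, the Gaussian noise, and this particular use of p are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def perturbation_oversample(X_min, n_new, p=0.1, seed=None):
    """Generate n_new synthetic minority instances by perturbing existing ones.

    Assumption (not from the paper): noise is Gaussian with a per-feature
    standard deviation of p * std(feature) computed over the minority class.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Per-feature spread of the minority class; p scales it.
    feature_std = X_min.std(axis=0)

    # Pick a base minority instance (with replacement) for each new sample.
    base_idx = rng.integers(0, len(X_min), size=n_new)

    # Perturb every feature of each chosen instance independently.
    noise = rng.normal(0.0, 1.0, size=(n_new, X_min.shape[1])) * (p * feature_std)
    return X_min[base_idx] + noise
```

In this sketch, p = 0 reduces to plain random oversampling (exact copies of minority instances), while larger p spreads the synthetic points further from the originals, which is one way a single hyperparameter could adapt the method to different data.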