The Impact of Data Normalization on the Accuracy of Machine Learning Algorithms: A Comparative Analysis

Kelsy Cabello-Solorzano; Isabela Ortigosa de Araujo; Marco Peña; Luís Correia; Antonio J. Tallón-Ballesteros

Published: 01 Jan 2023, Last Modified: 22 Aug 2024
Venue: SOCO (2) 2023
License: CC BY-SA 4.0

Abstract: Data normalization plays a fundamental role in Machine Learning (ML) algorithms. This research analyzes and compares the impact of various normalization techniques. Three normalization techniques, namely Min-Max, Z-Score, and Unit Normalization, were applied as a preliminary step before using various ML algorithms. For Min-Max, we used two variants: one normalizing feature values to the interval [0, 1] and the other normalizing them to the interval [-1, 1]. The objective of this study is to determine the most appropriate normalization technique for each algorithm, with the aim of improving accuracy. Through this comparative analysis, we aim to provide reliable recommendations for improving the performance of ML algorithms through proper data normalization. The results reveal that a few algorithms are virtually unaffected by normalization, regardless of the technique applied. These findings contribute to the understanding of the relationship between data normalization and algorithm performance, allowing practitioners to make informed decisions about normalization techniques when using ML algorithms.
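
For readers unfamiliar with the techniques named in the abstract, the following is a minimal Python sketch of the textbook definitions applied to a single numeric feature. It is not the authors' implementation. The function names (`min_max`, `z_score`, `unit_norm`) are hypothetical, the handling of constant or all-zero features is an assumption, and "Unit Normalization" is assumed here to mean scaling a vector to unit Euclidean (L2) length, which the abstract does not specify.

```python
import math
from statistics import mean, pstdev


def min_max(values, low=0.0, high=1.0):
    """Linearly rescale values into [low, high] (e.g. [0, 1] or [-1, 1])."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Assumption: a constant feature is mapped to the interval midpoint.
        return [(low + high) / 2 for _ in values]
    return [low + (v - v_min) * (high - low) / span for v in values]


def z_score(values):
    """Center values on the mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Assumption: a constant feature is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def unit_norm(values):
    """Scale a vector to unit Euclidean length (assumed meaning of 'Unit Normalization')."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        # Assumption: an all-zero vector is returned unchanged.
        return list(values)
    return [v / norm for v in values]


if __name__ == "__main__":
    feature = [2.0, 4.0, 6.0, 8.0]  # illustrative values, not from the paper
    print("Min-Max [0, 1]: ", min_max(feature))
    print("Min-Max [-1, 1]:", min_max(feature, -1.0, 1.0))
    print("Z-Score:        ", z_score(feature))
    print("Unit norm:      ", unit_norm(feature))
```

In an actual experiment the scaling parameters (minimum, maximum, mean, standard deviation) would typically be estimated on the training split only and then reused on the test split, so that no test information leaks into preprocessing.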
