Abstract: One of the challenges in the rapid development of the Internet of Things (IoT) and edge computing is deploying machine learning (ML) models on resource-constrained devices (RCDs). Preprocessing methods are important for improving ML model performance, especially when computing power is limited. This research examines the effectiveness of four prominent preprocessing methods in conjunction with four common ML algorithms: Support Vector Machines (SVM), Random Forest (RF), Logistic Regression (LR), and K-Nearest Neighbours (KNN). The methods are quantisation, Min-Max scaling, standardisation (Z-score normalisation), and quantile transformation. We aim to examine how different preprocessing techniques affect the performance of various ML algorithms, thereby offering insights into the most effective preprocessing approach for different algorithms on the HAR dataset. Our experiments show that different combinations of preprocessing methods and ML algorithms achieve different levels of accuracy, F1-score, and training time. Remarkably, quantisation, which is frequently used to minimise the memory footprint of models, produced only average results across all algorithms (around 58%). On the other hand, Min-Max scaling performed better with RF, attaining 94.72% accuracy with a training time of 6.4013 s, indicating its suitability for situations where resources are limited. Standardisation, a widespread normalisation method, yielded 94.56% accuracy with RF but only 48.66% with SVM. Furthermore, quantile transformation provided promising outcomes, with accuracies ranging from 69.51% to 94.68% across all algorithms. These results highlight the importance of matching the four preprocessing methods discussed to the specific ML algorithm, the application, and the constraints of the available RCD, such as limited processor resources and memory.
These results are helpful to researchers working to achieve higher model performance in the context of RCDs.
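The comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's pipeline: it uses a synthetic dataset in place of the HAR data, only one of the four classifiers (RF), and a simple affine 8-bit quantisation scheme, since the abstract does not specify the exact quantisation method used.

```python
# Hedged sketch: comparing the four preprocessing methods from the abstract
# on a synthetic dataset with a RandomForest classifier. The dataset,
# hyperparameters, and quantisation scheme are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, QuantileTransformer, StandardScaler

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def quantise_uint8(train, test):
    # Simple per-feature affine quantisation to 8-bit integers, fitted on
    # the training split only to avoid leaking test statistics.
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = (hi - lo) / 255.0
    q = lambda a: np.clip(np.round((a - lo) / scale), 0, 255).astype(np.uint8)
    return q(train), q(test)

preprocessors = {
    "minmax": MinMaxScaler(),                                   # Min-Max scaling
    "zscore": StandardScaler(),                                 # standardisation
    "quantile": QuantileTransformer(n_quantiles=100, random_state=0),
}

results = {}
for name, prep in preprocessors.items():
    Xtr = prep.fit_transform(X_train)
    Xte = prep.transform(X_test)
    clf = RandomForestClassifier(random_state=0).fit(Xtr, y_train)
    results[name] = accuracy_score(y_test, clf.predict(Xte))

# Quantisation handled separately, as it is not an sklearn transformer here.
Xtr_q, Xte_q = quantise_uint8(X_train, X_test)
clf = RandomForestClassifier(random_state=0).fit(Xtr_q, y_train)
results["quantised"] = accuracy_score(y_test, clf.predict(Xte_q))

print(results)
```

In a fuller replication, the loop would also sweep SVM, LR, and KNN and record training time alongside accuracy and F1-score.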
DOI: 10.1109/ISSC61953.2024.10603066