Modulo 9 model-based learning for missing data imputation

Appl. Soft Comput. 2021 (modified: 13 Jun 2021)
Highlights:
• Modulo 9, a novel method for handling missing data.
• A demonstration of how missing data can affect machine learning algorithms and decision-making.
• Application of robust machine learning techniques, the deletion method, and the average (mean value).
• Performance of the methods on a dataset containing missing data.

Abstract: Managing missing values is one of the challenges faced by data analysts. Creating effective data models is therefore a sound approach to missing data imputation. However, learning, training, and data analysis must be implemented through machine learning algorithms. Missing data is the problem of observations with no recorded feedback or variable values. This problem can seriously compromise data analysis, which may eventually lead to erroneous conclusions. This paper first studies how missing data can affect machine learning algorithms and decision-making based on the output of the data analysis. Second, it proposes Modulo 9 as a novel method for handling missing data. The proposed method is assessed in wide-ranging experiments against robust machine learning techniques and simple baselines: Support Vector Machine (SVM) Algorithm, Linear Regression (LR), K-Nearest Neighbors (KNN), Naïve Bayes (NB), Support Vector Classifier (SVC), Linear Support Vector Classifier (LSVC), Random Forest Classifier (RFC), Decision Tree Regressor (DTR), the Deletion Method, Multi-Layer Perceptron (MLP), and the Mean Value. The results show that the novel method outperforms the eleven (11) existing methods.
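For readers unfamiliar with the two simplest baselines named in the abstract, the deletion method and the mean value, the following is a minimal sketch of how they are commonly implemented. It does not reproduce Modulo 9, which the abstract does not describe. The function names `drop_incomplete_rows` and `mean_impute` and the toy data are hypothetical and chosen only for illustration; missing entries are assumed to be encoded as NaN.

```python
import math

def drop_incomplete_rows(rows):
    """Deletion baseline: keep only rows that contain no missing (NaN) entry."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]

def mean_impute(rows):
    """Mean baseline: replace each NaN with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # A column with no observed values cannot be imputed and stays NaN.
        means.append(sum(observed) / len(observed) if observed else math.nan)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in rows]

# Toy data (hypothetical): three records, two features, two missing entries.
nan = math.nan
data = [[1.0, 2.0], [nan, 4.0], [3.0, nan]]

deleted = drop_incomplete_rows(data)  # only the fully observed record remains
imputed = mean_impute(data)           # NaNs replaced by column means 2.0 and 3.0
```

Deletion discards two of the three records here, while mean imputation keeps all of them at the cost of shrinking each column's variance, which is one reason a paper such as this one compares them against model-based alternatives.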