Abstract: Missing values are a common challenge when applying symbolic regression to real-world data sets. A popular way to address this challenge is imputation. Traditional imputation methods typically estimate missing values from the predictive features alone, without considering the target variable. This work proposes a genetic programming-based wrapper imputation method, which wraps a regression method so that the target variable is taken into account when constructing imputation models for the incomplete features. Imputation models are evaluated on their regression performance in addition to their imputation performance. Genetic programming (GP) is used to build the imputation models, and a decision tree (DT) is used to evaluate the regression performance during the GP evolutionary process. The experimental results show that the proposed method significantly improves symbolic regression performance compared with several state-of-the-art imputation methods.
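To illustrate the wrapper idea, the following is a minimal sketch of a fitness function that scores a candidate imputation model on both its imputation error and the regression error obtained after imputation. It is not the authors' implementation. The function names, the toy data, the equal weighting `alpha=0.5`, and the use of RMSE are all assumptions. A depth-1 regression tree (a stump) stands in for the decision tree, three hand-written functions stand in for GP-evolved individuals, and the regression error is measured on the training rows for brevity.

```python
import math

def fit_stump(X, y):
    """Fit a depth-1 regression tree: the single split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no split possible: fall back to predicting the mean
        mean = sum(y) / len(y)
        return lambda row: mean
    _, j, thr, lm, rm = best
    return lambda row: lm if row[j] <= thr else rm

def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def wrapper_fitness(model, x1, x2, y, alpha=0.5):
    """Lower is better. alpha is a hypothetical weight between the two error terms."""
    # Imputation error: how well the model reproduces the observed values of x2.
    observed = [(a, b) for a, b in zip(x1, x2) if b is not None]
    imp_err = rmse([model(a) for a, _ in observed], [b for _, b in observed])
    # Regression error: fill the gaps, then fit the wrapped regressor on the result.
    X = [[a, b if b is not None else model(a)] for a, b in zip(x1, x2)]
    tree = fit_stump(X, y)
    reg_err = rmse([tree(row) for row in X], y)
    return alpha * imp_err + (1 - alpha) * reg_err

# Toy data (assumed): x2 = 2 * x1 with two missing entries, y = x1 + x2.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, None, 6, 8, None, 12]
y = [3, 6, 9, 12, 15, 18]

# Hand-written stand-ins for GP individuals; a GP system would evolve these.
candidates = {
    "2*x1": lambda a: 2 * a,
    "mean": lambda a: 7.0,  # mean of the observed x2 values
    "x1+1": lambda a: a + 1,
}

scores = {name: wrapper_fitness(f, x1, x2, y) for name, f in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

In a full GP run, a function like `wrapper_fitness` would be called on every individual in each generation, so selection favors imputation models that both fit the observed feature values and support accurate regression on the target.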