A Comparative Analysis of Dimensionality Reduction Methods for Genetic Programming to Solve High-Dimensional Symbolic Regression Problems

Published: 01 Jan 2021, Last Modified: 16 May 2025, Venue: SMC 2021, License: CC BY-SA 4.0
Abstract: Genetic Programming (GP) is a powerful evolutionary algorithm with a wide range of real-world applications. High-dimensional symbolic regression (HDSR) is an important yet challenging application of GP. This paper presents a comparative study of how effectively dimensionality reduction (DR) techniques assist GP on HDSR problems. Three popular DR techniques are selected for comparison and discussion: the Pearson Correlation Coefficient (PCC), Principal Component Analysis (PCA), and the Maximal Information Coefficient (MIC). The experimental results show that considering only correlation during DR is not sufficient to produce a suitable reduced set of problem dimensions, and that GP with DR may perform worse than GP without DR. We also propose a novel two-phase DR method that considers both correlation and redundancy. The proposed method yields a more reasonable set of reduced dimensions, which effectively improves the performance of GP on HDSR problems.
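To make the idea of combining correlation and redundancy concrete, here is a minimal sketch of a generic two-phase filter. It is not the authors' method, whose details the abstract does not give. It assumes PCC is used for both phases, and the function names (`pearson_abs`, `two_phase_select`) and thresholds (`relevance_min`, `redundancy_max`) are hypothetical choices made only for illustration.

```python
import numpy as np

def pearson_abs(x, y):
    """Absolute Pearson correlation of two 1-D arrays (0.0 if either is constant)."""
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc @ xc) * (yc @ yc))
    return 0.0 if denom == 0 else abs(float(xc @ yc) / denom)

def two_phase_select(X, y, relevance_min=0.3, redundancy_max=0.9):
    """Illustrative two-phase filter (assumed design, not the paper's algorithm).

    Phase 1 keeps features whose |PCC| with the target reaches relevance_min.
    Phase 2 walks the survivors from most to least relevant and drops any
    feature whose |PCC| with an already selected feature exceeds redundancy_max.
    """
    n_features = X.shape[1]
    relevance = [pearson_abs(X[:, j], y) for j in range(n_features)]

    # Phase 1: correlation with the target, most relevant first.
    order = sorted(range(n_features), key=lambda j: relevance[j], reverse=True)
    candidates = [j for j in order if relevance[j] >= relevance_min]

    # Phase 2: greedy redundancy removal among the candidates.
    selected = []
    for j in candidates:
        if all(pearson_abs(X[:, j], X[:, k]) <= redundancy_max for k in selected):
            selected.append(j)
    return selected

# Synthetic data: x2 is a near copy of x0 (redundant), x3 is unrelated noise.
rng = np.random.default_rng(0)
x0, x1, x3 = rng.normal(size=(3, 500))
x2 = x0 + 0.01 * rng.normal(size=500)
X = np.column_stack([x0, x1, x2, x3])
y = x0 + x1

print(two_phase_select(X, y))
```

In this toy setup a correlation-only filter would keep both `x0` and its near copy `x2`, while the second phase keeps only one of them. The reduced feature set would then be passed to GP as its terminal set.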