Improving the Feature Selection Stability of the Delta Test in Regression

Published: 01 Jan 2024 · Last Modified: 16 May 2025 · IEEE Trans. Artif. Intell. 2024 · License: CC BY-SA 4.0
Abstract: Feature selection is an important preprocessing step that helps to improve model performance and to extract knowledge about important features in a dataset. However, feature selection methods are known to be adversely impacted by changes in the training dataset: even small differences between input datasets can result in the selection of different feature sets. This letter tackles this issue in the particular case of the delta test (DT), a well-known feature relevance criterion that approximates the noise variance for regression tasks. A new feature selection criterion is proposed, the delta test bar, which is shown to be more stable than its close competitors.

Impact Statement: Feature selection makes it possible to identify the attributes that play an important role in predicting a target. However, some feature selection methods, such as the delta test, suffer from instability. As a result, it is difficult to trust that the features selected are the most relevant ones. The method we present in this letter improves the stability of the delta test, thereby increasing the trustworthiness of the feature selection procedure.
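For context, the standard delta test scores a candidate feature subset by estimating the noise variance of the regression target: it averages half the squared difference between each target value and the target of its nearest neighbour in the chosen input space, and a lower score indicates a more relevant subset. Below is a minimal sketch of this classical criterion (not the delta test bar proposed in the letter), assuming squared Euclidean distances and a single nearest neighbour; the function name `delta_test` and the synthetic data in the usage note are illustrative.

```python
import numpy as np

def delta_test(X, y):
    """Classical delta test: estimate the noise variance of y given the
    features in X as half the mean squared difference between each
    target and the target of its nearest neighbour in input space."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise squared Euclidean distances between all rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)   # exclude each point as its own neighbour
    nn = d2.argmin(axis=1)         # index of each point's nearest neighbour
    return 0.5 * np.mean((y - y[nn]) ** 2)
```

A subset containing a relevant feature yields neighbours with similar targets, hence a low score, while an irrelevant subset pairs each point with an effectively random neighbour and the score approaches half the target variance:

```python
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=200)  # only feature 0 matters
delta_test(X[:, [0]], y)   # low: close to the true noise variance 0.01
delta_test(X[:, [1]], y)   # high: near half the variance of y
```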