Feature subset selection by SVM ensemble

Published: 01 Jan 2016 · Last Modified: 07 May 2025 · SSCI 2016 · CC BY-SA 4.0
Abstract: Feature selection (FS) has proven useful for improving the generalization performance of classifiers. For applications with a small number of instances but a large number of input features, FS methods based on the evaluation of a single classifier are prone to instability. We propose a new FS algorithm based on SVM ensemble learning. First, an ensemble of SVM classifiers is trained on re-sampled subsets of the training data. Then, given a predefined feature ranking criterion, a new stability criterion is defined over the ranking criterion values across the classifiers to measure the relevance of each feature. This measure favors features whose ranking criterion values are stable over features whose values are subject to large variations. Unstable features usually carry little information relevant to the class label and can be removed to improve the generalization performance of the classifier. To rank the features, the method requires only a small number of SVM classifiers to be trained, so it solves feature selection problems with a large number of input features very quickly. Combined with a backward elimination procedure, the method is robust for feature selection problems with very small sample sizes. In this paper, we evaluate its performance on nonlinear selection tasks.
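The ensemble-stability idea in the abstract can be sketched in a few lines. The paper's exact ranking criterion is not stated here, so this illustration assumes the common choice of |w_j| from a linear SVM as the per-feature criterion, bootstrap resampling to build the ensemble, and a mean-over-standard-deviation score as a stand-in for the stability measure; all of these are assumptions, not the authors' definitions.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic demo data (assumption): 40 samples, 20 features,
# with the label driven only by the first three features.
X = rng.normal(size=(40, 20))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

n_classifiers = 10
criterion_values = []
for _ in range(n_classifiers):
    # Re-sampled subset of the training data (bootstrap, an assumption).
    idx = rng.choice(len(X), size=len(X), replace=True)
    clf = LinearSVC(C=1.0, max_iter=10000).fit(X[idx], y[idx])
    # Per-feature ranking criterion: |w_j| of the linear SVM (assumption).
    criterion_values.append(np.abs(clf.coef_[0]))
criterion_values = np.array(criterion_values)  # shape: (n_classifiers, n_features)

# Stability-aware relevance: reward features whose criterion values are
# consistently large across the ensemble, penalize large variation.
relevance = criterion_values.mean(axis=0) / (criterion_values.std(axis=0) + 1e-12)
ranking = np.argsort(relevance)[::-1]  # best features first
print(ranking[:5])
```

For very small sample sizes, the abstract combines this ranking with backward elimination: drop the lowest-scoring feature(s), retrain the ensemble on the reduced feature set, and repeat until the desired subset size is reached.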