Robust Feature Selection using Sparse Centroid-Encoder

Tomojit Ghosh; Michael Kirby

29 Sept 2021 (modified: 13 Feb 2023). ICLR 2022 Conference Withdrawn Submission. Readers: Everyone

Keywords: Feature Selection, Sparse Centroid-Encoder, Non-linear Feature Selection, Deep Feature Selection, Multi-class Feature Selection, Iterative Feature Selection

Abstract: We develop a sparse optimization problem for determining the total set of features that discriminate two or more classes. This is a sparse implementation of the centroid-encoder for nonlinear data reduction and visualization, called Sparse Centroid-Encoder (SCE). We also provide an iterative feature selection algorithm that first ranks each feature by its occurrence and then chooses the optimal number of features using a validation set. The algorithm is applied to a wide variety of data sets, including single-cell biological data, high-dimensional infectious disease data, hyperspectral data, image data, and GIS data. We compare our method to various state-of-the-art feature selection techniques, including three neural network-based models (DFS, SG-L1-NN, G-L1-NN), Sparse SVM, and Random Forest. We show empirically that SCE features produce better classification accuracy on unseen test data, often with fewer features.
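The ranking-then-validation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how occurrences are counted or which classifier scores the validation set, so the following assumes that several selection runs each return a set of feature indices, and it uses a plain nearest-centroid classifier as a stand-in scorer. It is not the SCE model itself, and the function names `rank_by_occurrence`, `nearest_centroid_accuracy`, and `choose_num_features` are hypothetical.

```python
import numpy as np

def rank_by_occurrence(selected_sets, n_features):
    """Rank features by how often they appear across selection runs.

    Assumption: each run yields a collection of selected feature indices.
    """
    counts = np.zeros(n_features, dtype=int)
    for s in selected_sets:
        counts[sorted(set(s))] += 1
    # Stable sort: most frequent first, ties keep ascending index order.
    ranking = np.argsort(-counts, kind="stable")
    return ranking, counts

def nearest_centroid_accuracy(X_tr, y_tr, X_va, y_va):
    """Stand-in scorer (an assumption, not the paper's classifier)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from each validation sample to each centroid.
    dists = ((X_va[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_va).mean())

def choose_num_features(ranking, X_tr, y_tr, X_va, y_va, max_k=None):
    """Pick the smallest top-k prefix with the best validation accuracy."""
    max_k = len(ranking) if max_k is None else max_k
    best_k, best_acc = 1, -1.0
    for k in range(1, max_k + 1):
        cols = ranking[:k]
        acc = nearest_centroid_accuracy(X_tr[:, cols], y_tr, X_va[:, cols], y_va)
        if acc > best_acc:  # strict '>' keeps the smaller k on ties
            best_k, best_acc = k, acc
    return best_k, best_acc
```

Using a strict comparison when updating the best score favors the smaller feature set on ties, which matches the abstract's emphasis on achieving accuracy with fewer features; other tie-breaking rules would be equally plausible.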

One-sentence Summary: Feature Selection Using Sparse Centroid-encoder.

Supplementary Material: zip
