Abstract: Dimensionality reduction is a key preprocessing step for many applications. To our knowledge, unsupervised approaches such as PCA and ICA do not take the label information of the original data into account, so a supervised approach such as linear discriminant analysis (LDA) performs better on many classification tasks. Unfortunately, classical LDA has shortcomings, such as the well-known small sample size problem, the heteroscedastic problem, and the (C-1) low-rank problem. The (C-1) low-rank problem greatly limits the dimension of the extracted features: with C classes, at most C-1 features can be extracted. In addition, the between-class and within-class scatter matrices in classical LDA account only for a Mahalanobis-like covariance distance between the class centers and within each class, so classical LDA does not work well if the dataset has very few classes, or if the data distribution of each class is not Gaussian-like but instead has some spatial structure in the feature space. In this paper we propose a dimensionality reduction approach that avoids these limitations of classical LDA and improves the handling of the between-class scatter matrix. Our approach takes the distribution of the data in each class into consideration when calculating the projection matrix. It does not assume that the data distribution of each class is approximately Gaussian; each class can have its own spatial structure. Experiments show that our method obtains better projection directions than classical LDA and greatly improves classification accuracy. In addition, our approach reconstructs the original signal well, whereas classical LDA ignores the reconstruction property.
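As an illustration of the (C-1) low-rank problem described above, the following minimal sketch computes the classical LDA scatter matrices on synthetic data and checks the rank of the between-class scatter matrix. It covers only classical LDA, not the approach proposed in this paper. The function name `scatter_matrices`, the synthetic Gaussian data, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def scatter_matrices(X, y):
    """Classical LDA between-class (S_b) and within-class (S_w) scatter.

    Hypothetical helper written for this illustration only.
    """
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        # Each class contributes one rank-one term built from its center.
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
        # Within-class scatter: spread of samples around their class center.
        centered = Xc - mean_c
        S_w += centered.T @ centered
    return S_b, S_w

# Synthetic data (an assumption): C Gaussian classes in d dimensions.
rng = np.random.default_rng(0)
C, n_per_class, d = 3, 50, 10
X = np.vstack([rng.normal(loc=3.0 * k, size=(n_per_class, d)) for k in range(C)])
y = np.repeat(np.arange(C), n_per_class)

S_b, S_w = scatter_matrices(X, y)
rank_b = np.linalg.matrix_rank(S_b)
rank_w = np.linalg.matrix_rank(S_w)
print("rank(S_b) =", rank_b, "with C - 1 =", C - 1, "and d =", d)
```

Because the weighted class-center offsets sum to zero, S_b has rank at most C-1 regardless of the feature dimension d, so classical LDA can yield at most C-1 discriminant directions.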