Abstract: We consider the dimensionality-reduction problem for a contaminated data set in a very high-dimensional space, i.e., the problem of finding a subspace approximation of observed data, where the number of observations is of the same magnitude as the number of variables in each observation, and the data set contains some outlying observations. We propose a High-dimensional Robust Principal Component Analysis (HR-PCA) algorithm that is tractable, robust to outliers, and easily kernelizable. The resulting subspace has a bounded deviation from the desired one, and achieves optimality in the limiting case where the proportion of outliers goes to zero.
External IDs: dblp:conf/allerton/XuCM08