A High-Dimensional Outlier Detection Algorithm Base on Relevant Subspace

Published: 01 Jan 2017, Last Modified: 17 Apr 2025, DASC/PiCom/DataCom/CyberSciTech 2017, CC BY-SA 4.0
Abstract: Outlier detection in high-dimensional big data is an important data mining task that distinguishes outliers from regular objects. Traditional outlier detection approaches search the full data space and miss outliers hidden within it. Moreover, these methods deteriorate under the notorious "curse of dimensionality": distance can no longer express the deviation between outliers and normal objects, and the exponential computation cost leads to low efficiency. In this paper, we propose an outlier detection method based on relevant subspaces, which can effectively describe the local distribution of objects and detect outliers hidden in subspaces of the data. Thorough experiments on synthetic and real data show that the method outperforms competing outlier ranking approaches by detecting outliers in subspaces.
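The claim that distance stops expressing deviation in high dimensions can be illustrated with a small experiment. The sketch below is not the paper's method. It only measures how the gap between the nearest and farthest neighbour of a query point shrinks as dimensionality grows, assuming uniformly random data. The function name `relative_contrast` and all parameters are hypothetical choices made for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Illustrative assumption: points are drawn uniformly from the unit
    hypercube. A small value means the nearest and farthest points are
    almost equally far away, so distance carries little information.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    # Euclidean distance from the query to every point in the sample.
    dists = [math.dist(query, p) for p in points]
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the contrast drops sharply with dimension, which is one way to see why full-space distance measures struggle and why restricting attention to a few relevant attributes can help.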