Correlation-based Attribute Outlier Detection in XML

Published: 2008, Last Modified: 12 May 2023 | ICDE 2008 | Readers: Everyone
Abstract: Compared to relational data models, the hierarchical structure of semi-structured data such as XML provides semantically meaningful neighbourhoods that benefit data cleaning tasks like outlier detection. In this paper, we introduce the concept of a correlated subspace, which leverages the hierarchical relationships between XML attributes to provide contextually informative neighbourhoods for attribute outlier detection. We also design two correlation-based attribute outlier metrics for XML, namely the xO-Measure and the xQ-Measure. Experimental results support the effectiveness of our XML outlier detection approach.
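
To make the idea of a hierarchical neighbourhood concrete, the following is a minimal illustrative sketch, not the paper's method. The abstract does not define the xO-Measure or the xQ-Measure, so this sketch substitutes a plain conditional frequency as a stand-in score. The sample document, the element names (`person`, `country`, `city`), the helper names, and the 0.5 threshold are all assumptions made for illustration. It treats the leaf children of one parent element as a neighbourhood and flags a value pair that rarely co-occurs within such neighbourhoods.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: one record pairs Japan with Paris.
XML = """
<db>
  <person><country>France</country><city>Paris</city></person>
  <person><country>France</country><city>Paris</city></person>
  <person><country>France</country><city>Paris</city></person>
  <person><country>Japan</country><city>Tokyo</city></person>
  <person><country>Japan</country><city>Tokyo</city></person>
  <person><country>Japan</country><city>Tokyo</city></person>
  <person><country>Japan</country><city>Paris</city></person>
</db>
"""


def neighbourhoods(root):
    """Yield {tag: text} for the leaf children of each parent element.

    Leaves that share a parent are treated as one neighbourhood
    (an assumption standing in for the paper's correlated subspace).
    """
    for parent in root.iter():
        leaves = {c.tag: (c.text or "").strip() for c in parent if len(c) == 0}
        if len(leaves) >= 2:
            yield leaves


def rare_pairs(xml_text, context_tag, target_tag, threshold=0.5):
    """Flag (context, target) value pairs whose conditional frequency
    count(context, target) / count(context) falls below the threshold.

    This score is a simple stand-in, not the xO-Measure or xQ-Measure.
    """
    root = ET.fromstring(xml_text)
    pair_counts = Counter()
    context_counts = Counter()
    for leaves in neighbourhoods(root):
        if context_tag in leaves and target_tag in leaves:
            pair_counts[(leaves[context_tag], leaves[target_tag])] += 1
            context_counts[leaves[context_tag]] += 1
    flagged = []
    for (ctx, tgt), n in pair_counts.items():
        score = n / context_counts[ctx]
        if score < threshold:
            flagged.append((ctx, tgt, score))
    return flagged


if __name__ == "__main__":
    for ctx, tgt, score in rare_pairs(XML, "country", "city"):
        print(f"possible attribute outlier: {tgt!r} given {ctx!r} (score={score})")
```

In this toy data, "Paris" is a common city value overall, so it would not stand out in a flat, relational-style frequency count; it only looks suspicious inside the neighbourhood of records whose country is "Japan". That is the kind of contextual signal the abstract attributes to the hierarchical structure of XML.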