Model-Based Outlier Detection for Object-Relational Data

Published: 2015, Last Modified: 12 May 2023, SSCI 2015
Abstract: This paper extends unsupervised statistical outlier detection to the case of object-relational data, based on probabilistic modeling. Object-relational data represent a complex heterogeneous network, which comprises objects of different types, links among these objects, also of different types, and attributes of these links. This special structure prohibits a direct vectorial data representation. We apply state-of-the-art probabilistic modeling techniques for object-relational data that construct a graphical model (Bayesian network), which compactly represents probabilistic associations in the data. We propose a new metric, based on the learned object-relational model, that quantifies the extent to which the individual association pattern of a potential outlier deviates from that of the whole population. The metric is based on the likelihood ratio of two parameter vectors: one that represents the population associations, and another that represents the individual associations. Our method is validated on synthetic datasets and on real-world datasets about soccer matches and movies. Compared to baseline methods, our novel transformed likelihood ratio achieved the best detection accuracy on all datasets.
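To make the likelihood-ratio idea concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's metric: it replaces the learned Bayesian network with a single categorical variable, omits the transformation the abstract mentions, and uses Laplace smoothing for the population parameters. The function name `log_likelihood_ratio` and the toy match outcomes are hypothetical.

```python
import math
from collections import Counter


def log_likelihood_ratio(individual_obs, population_obs):
    """Log-likelihood ratio of individual vs. population parameters.

    Both parameter vectors are evaluated on the individual's own
    observations. Assumption: one categorical variable stands in for
    the full object-relational model.
    """
    ind_counts = Counter(individual_obs)
    pop_counts = Counter(population_obs)
    n_ind = len(individual_obs)
    n_pop = len(population_obs)
    num_categories = len(set(ind_counts) | set(pop_counts))

    score = 0.0
    for value, count in ind_counts.items():
        # Individual parameter: maximum-likelihood estimate.
        theta_ind = count / n_ind
        # Population parameter: Laplace-smoothed so it is never zero.
        theta_pop = (pop_counts[value] + 1) / (n_pop + num_categories)
        score += count * (math.log(theta_ind) - math.log(theta_pop))
    return score


# Hypothetical toy data: outcomes of soccer matches.
population = ["win"] * 50 + ["loss"] * 50
typical_team = ["win", "loss"] * 5
unusual_team = ["win"] * 10

typical_score = log_likelihood_ratio(typical_team, population)
unusual_score = log_likelihood_ratio(unusual_team, population)
```

In this toy setting, a team whose outcome frequencies match the population scores near zero, while a team that deviates from the population scores higher, which is the intuition behind ranking potential outliers by such a ratio.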