Abstract: Outlier detection, a data mining technique for finding rare events, deviant objects, and exceptions in data, has drawn increasing attention in recent years. Much existing research targets record data with numerical attributes or sets of points with numeric values. However, very few studies have attempted to detect outliers in data consisting of items. We focus on transaction data and propose a framework for detecting outlier transactions that behave abnormally compared to others. We regard as an outlier a transaction t that is missing many items which would normally be expected to appear because of their strong dependency on itemsets in t. We use association rules with high confidence to calculate the outlier degree. In this paper, we first discuss what outlier transactions are and define an outlier degree for detecting them systematically. We also propose algorithms for efficiently detecting outlier transactions in transaction databases. We present two techniques for faster detection that (i) remove redundant association rules and (ii) prune candidate outlier transactions using maximal frequent itemsets. In experiments on synthetic and real-world data sets, we show that our proposal achieves sufficient detection accuracy and detects outlier transactions faster than a brute-force algorithm.
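
The abstract does not state the formula for the outlier degree, so the following is only a minimal sketch of the general idea under assumed definitions, not the authors' algorithm. It assumes that high-confidence association rules have already been mined and are given as (antecedent, consequent) pairs, that the items "expected" for a transaction are obtained by repeatedly applying every rule whose antecedent is satisfied, and that the score is the fraction of expected items that are missing. The names `expected_items`, `outlier_degree`, and `EXAMPLE_RULES`, as well as the sample rules, are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (antecedent, consequent); rules are assumed to be pre-filtered
# so that only high-confidence rules remain.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]

# Hypothetical example rules, for illustration only.
EXAMPLE_RULES: List[Rule] = [
    (frozenset({"bread"}), frozenset({"butter"})),
    (frozenset({"butter"}), frozenset({"jam"})),
    (frozenset({"milk"}), frozenset({"cereal"})),
]


def expected_items(transaction: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Return the transaction plus every item implied by the rules.

    Assumption: rules are applied repeatedly until no new item is added.
    """
    closure = set(transaction)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent <= closure and not consequent <= closure:
                closure |= consequent
                changed = True
    return closure


def outlier_degree(transaction: Iterable[str], rules: List[Rule]) -> float:
    """Fraction of expected items that the transaction does not contain.

    Assumed scoring: 0.0 means nothing expected is missing; values near 1.0
    mean most of the expected items are absent.
    """
    items = set(transaction)
    closure = expected_items(items, rules)
    if not closure:
        return 0.0  # empty transaction and no applicable rules
    return len(closure - items) / len(closure)


if __name__ == "__main__":
    for t in [{"bread", "butter", "jam"}, {"bread"}, {"bread", "milk"}]:
        print(sorted(t), round(outlier_degree(t, EXAMPLE_RULES), 3))
```

Under these assumptions, a transaction that already contains everything the rules imply scores 0, while a transaction such as {bread} that lacks the implied butter and jam scores higher. The two speed-ups mentioned in the abstract (removing redundant rules and pruning candidates with maximal frequent itemsets) are not modeled here.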