Abstract: Pattern mining provides useful tools for exploratory data analysis. Numerous efficient algorithms exist that are able to discover various types of patterns in large datasets. However, the problem of identifying patterns that are genuinely interesting to a particular user remains challenging. Current approaches generally require considerable data mining expertise or effort and hence cannot be used by typical domain experts. We show that it is possible to resolve this issue by interactive learning of user-specific pattern ranking functions, where a user ranks small sets of patterns and a general ranking function is inferred from this feedback by preference learning techniques. We present a general framework for learning pattern ranking functions and propose a number of active learning heuristics that aim at minimizing the required user effort. In particular, we focus on Subgroup Discovery, a specific pattern mining task. We evaluate the capacity of the algorithm to learn a ranking of a subgroup set defined by a complex quality measure, given only reasonably small sample rankings. Experiments demonstrate that preference learning has the capacity to learn accurate rankings and that active learning heuristics help reduce the required user effort. Moreover, using learned ranking functions as search heuristics allows discovering subgroups of substantially higher quality than those in the given set. This shows that active preference learning is potentially an important building block of interactive pattern mining systems.