Abstract: Integrating data mining techniques into database systems has gained popularity, and its significance is well recognized. However, the performance of SQL-based data mining is known to fall behind that of specialized implementations. The reasons include, among others, the prohibitive cost of extracting knowledge and the lack of suitable declarative query language support. Recent studies have found that, for association rule mining and sequential pattern mining, carefully tuned SQL formulations can achieve performance comparable to systems that cache the data in files outside the DBMS. However, most previous pattern mining methods follow the Apriori approach, which still encounters problems when a sequential database is large and/or when the sequential patterns to be mined are numerous and long.