Abstract: Finding a small set of representative tuples from a large database is an important functionality for supporting multi-criteria decision making. Top-<inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> queries and skyline queries are two widely studied approaches to this task. However, both have limitations: a top-<inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> query requires the user to provide a utility function in order to find the <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> tuples with the highest scores, whereas a skyline query does not need any user-specified utility function but cannot control the result size. To overcome these drawbacks, the <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>-regret minimization query was proposed and has recently received much attention, since it does not require any user-specified utility function and returns a fixed-size result set. Specifically, it selects a set <inline-formula><tex-math notation="LaTeX">$R$</tex-math></inline-formula> of tuples with a pre-defined size <inline-formula><tex-math notation="LaTeX">$r$</tex-math></inline-formula> from a database <inline-formula><tex-math notation="LaTeX">$D$</tex-math></inline-formula> such that the <i>maximum <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>-regret ratio</i>, which captures how well the top-ranked tuple in <inline-formula><tex-math notation="LaTeX">$R$</tex-math></inline-formula> represents the top-<inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> tuples in <inline-formula><tex-math notation="LaTeX">$D$</tex-math></inline-formula> for any possible utility function, is minimized.
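To make the objective above concrete, the following is a minimal Python sketch of how the maximum k-regret ratio of a candidate set R could be estimated. It is an illustration under stated assumptions, not the DynCore algorithm: it assumes linear utility functions with nonnegative weights, takes the k-regret ratio for one utility to be max(0, 1 - best score in R / k-th best score in D), and approximates the maximum by random sampling, which yields only a lower bound on the true maximum. The function names `k_regret_ratio` and `estimate_max_k_regret_ratio` are hypothetical.

```python
import random


def k_regret_ratio(D, R, u, k):
    """k-regret ratio of R w.r.t. D for one linear utility vector u (assumed form)."""
    def score(p):
        return sum(w * x for w, x in zip(u, p))

    # Score of the k-th ranked tuple in the full database D.
    kth_best = sorted((score(p) for p in D), reverse=True)[k - 1]
    if kth_best <= 0:
        return 0.0  # degenerate utility: nothing to regret
    # Score of the top-ranked tuple in the representative set R.
    best_in_R = max(score(p) for p in R)
    return max(0.0, 1.0 - best_in_R / kth_best)


def estimate_max_k_regret_ratio(D, R, k, num_samples=1000, seed=0):
    """Sampled lower bound on the maximum k-regret ratio over linear utilities."""
    rng = random.Random(seed)
    d = len(D[0])
    worst = 0.0
    for _ in range(num_samples):
        # Assumption: utilities are nonnegative weight vectors drawn uniformly.
        u = [rng.random() for _ in range(d)]
        worst = max(worst, k_regret_ratio(D, R, u, k))
    return worst


if __name__ == "__main__":
    D = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
    R = [(0.6, 0.6)]
    print(estimate_max_k_regret_ratio(D, R, k=1))
    print(estimate_max_k_regret_ratio(D, R, k=2))
```

Raising k relaxes the target: R only has to compete with the k-th ranked tuple of D rather than the best one, so the ratio can only stay the same or decrease.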
Although many methods exist for <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>-regret minimization query processing, most of them are designed for static databases without tuple insertions and deletions. The only known algorithm for processing continuous <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>-regret minimization queries (C<inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>RMQ) in dynamic databases suffers from suboptimal approximation and high time complexity. In this paper, we propose a novel dynamic coreset-based approach, called <small>DynCore</small>, for C<inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>RMQ processing. It achieves the same (asymptotically optimal) upper bound on the maximum <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>-regret ratio as the best-known static algorithm. Meanwhile, its time complexity is sublinear in the database size, which is significantly lower than that of the existing dynamic algorithm. The efficiency and effectiveness of <small>DynCore</small> are confirmed by experimental results on real-world and synthetic datasets.