Abstract: Join order selection has been widely studied, and Dynamic Programming (DP) is the standard algorithm for finding the optimal join order. However, existing DP algorithms are known to be unable to handle so-called interesting orders (e.g., sort orders), because an algorithm that considers sort orders together with joins violates the optimal substructure behind DP. As a result, it is difficult for DBMSs to find the optimal join order in the presence of sort orders, as doing so incurs extremely high overhead. In this paper, we study a novel DP algorithm that finds the optimal join order while taking sort orders into consideration. We call this the join&sort orders selection problem, whose goal is to minimize the total join&sort cost of processing a join query. The problem is challenging because both join order selection and sort order selection for a given join tree are known to be NP-hard. In addition, join and sort orders are interdependent: changing one order affects the selection of the other. We show that optimal substructure exists for join&sort orders selection by DP under a simple condition, which we call the \(\varOmega \)-condition. The \(\varOmega \)-condition does not restrict the join queries that can be optimized; rather, it is a condition that allows us to find the optimal solution for any join query. We present DP algorithms for bushy and linear join trees, and discuss their pruning techniques and complexity. We conduct extensive experimental studies to show the efficiency and robustness of our approach.