Platform-Independent Robust Query Processing

Srinivas Karthik, Jayant R. Haritsa, Sreyash Kenkre, Vinayaka Pandit, Lohit Krishnan

Published: 01 Jan 2019; Last Modified: 22 Jun 2023; IEEE Trans. Knowl. Data Eng. 2019; Readers: Everyone

Abstract: To address the classical selectivity estimation problem for OLAP queries in relational databases, a radically different approach called PlanBouquet was recently proposed in [1], wherein the estimation process is completely abandoned and replaced with a calibrated discovery mechanism. The beneficial outcome of this new construction is that provable guarantees on worst-case performance, measured as Maximum Sub-Optimality (MSO), are obtained, thereby facilitating robust query processing. The PlanBouquet formulation suffers, however, from a systemic drawback: the MSO bound is a function of not only the query, but also the optimizer's behavioral profile over the underlying database platform. As a result, there are adverse consequences: (i) the bound value becomes highly variable, depending on the specifics of the current operating environment, and (ii) it becomes infeasible to compute the value without substantial investments in preprocessing overheads. In this paper, we first present SpillBound, a new query processing algorithm that retains the core strength of the PlanBouquet discovery process, but reduces the bound dependency to only the query. It does so by incorporating plan termination and selectivity monitoring mechanisms in the database engine. Specifically, SpillBound delivers a worst-case multiplicative bound of $D^2+3D$, where $D$ is simply the number of error-prone predicates in the user query. Consequently, the bound value becomes independent of the optimizer and the database platform, and the guarantee can be issued simply by query inspection. We go on to prove that SpillBound is within an $O(D)$ factor of the best possible deterministic selectivity discovery algorithm in its class. We next devise techniques to bridge this quadratic-to-linear MSO gap by introducing the notion of contour alignment, a characterization of the nature of plan structures along the boundaries of the selectivity space. Specifically, we propose a variant of SpillBound, called AlignedBound, which exploits the alignment property and provides a guarantee in the range $[2D+2, D^2+3D]$. Finally, a detailed empirical evaluation over the standard decision-support benchmarks indicates that: (i) SpillBound provides markedly superior performance w.r.t. MSO as compared to PlanBouquet, and (ii) AlignedBound provides additional benefits for query instances that are challenging for SpillBound, often coming close to the ideal of MSO linearity in $D$. From an absolute perspective, AlignedBound evaluates virtually all the benchmark queries considered in our study with an MSO of around 10 or less. Therefore, in an overall sense, SpillBound and AlignedBound offer a substantive step forward in the long-standing quest for robust query processing.
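
Because both guarantees quoted in the abstract depend only on $D$, they can be tabulated by simple arithmetic. The following is a minimal sketch of that calculation, not code from the paper; the function names `spillbound_mso_bound` and `alignedbound_mso_range` are hypothetical and chosen here only for illustration.

```python
from typing import Tuple


def spillbound_mso_bound(d: int) -> int:
    """Worst-case MSO guarantee D^2 + 3D stated in the abstract for SpillBound.

    `d` is the number of error-prone predicates in the query.
    (Hypothetical helper name; only the formula comes from the abstract.)
    """
    if d < 1:
        raise ValueError("d must be a positive number of error-prone predicates")
    return d * d + 3 * d


def alignedbound_mso_range(d: int) -> Tuple[int, int]:
    """Guarantee range [2D + 2, D^2 + 3D] stated in the abstract for AlignedBound."""
    return 2 * d + 2, spillbound_mso_bound(d)


if __name__ == "__main__":
    # Tabulate both guarantees for small D to show the quadratic-to-linear gap.
    for d in range(1, 6):
        low, high = alignedbound_mso_range(d)
        print(f"D={d}: SpillBound <= {high}, AlignedBound in [{low}, {high}]")
```

For example, with $D = 3$ the SpillBound guarantee is $3^2 + 3 \cdot 3 = 18$, while the AlignedBound guarantee lies somewhere in $[8, 18]$; at $D = 1$ the two ends of the range coincide at 4.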
