Abstract: Efficient query optimization is crucial in database management systems (DBMS). In recent years, machine learning models have been widely applied to query optimizers. These models learn from executed plans to generate and select better query plans. Existing research has demonstrated that, after sufficient training, learned query optimizers can show significant advantages on relevant query benchmarks. However, substantial training overhead and unpredictable performance regressions limit their practicality. In this paper, we propose DORA, which employs a DOuble-hints strategy to guide the generation of candidate plans and uses a Reliability-Associated learning-to-rank (LTR) model to select the optimal plan. DORA combines operator hints and leading hints to enable the query optimizer to generate richer candidate plans. It then trains an LTR model to evaluate these candidate plans and clusters them in a fine-grained manner. Each cluster is associated with a reliability value that reflects the accuracy of the model’s predictions. We filter out plans in low-reliability clusters, thereby eliminating the risk of model regression. Our preliminary experiments show that DORA performs better than the traditional query optimizer in PostgreSQL and can match or even outperform state-of-the-art learned optimizers. Furthermore, DORA also exhibits excellent optimization performance without learning from the full training data.
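The reliability-based filtering step can be illustrated with a minimal Python sketch. It is not DORA's implementation: the `CandidatePlan` fields, the `select_plan` function, the threshold, the fallback to a default plan, and the hint labels are all hypothetical names and assumptions chosen for illustration. The sketch assumes each candidate plan already carries an LTR score and a cluster id, and that a per-cluster reliability value is available.

```python
from dataclasses import dataclass


@dataclass
class CandidatePlan:
    hint: str       # illustrative label for the hint that produced the plan
    score: float    # assumed LTR model score; higher means predicted better
    cluster: int    # assumed id of the cluster the plan was assigned to


def select_plan(candidates, reliability, threshold, default_plan):
    """Return the best-scored plan among clusters that meet the threshold.

    Plans in clusters whose reliability is below `threshold` (or unknown)
    are discarded. If nothing survives, fall back to `default_plan`,
    an assumption of this sketch rather than a detail from the abstract.
    """
    trusted = [
        p for p in candidates
        if reliability.get(p.cluster, 0.0) >= threshold
    ]
    if not trusted:
        return default_plan
    return max(trusted, key=lambda p: p.score)


# Hypothetical data: the highest-scored plan sits in an unreliable cluster.
default = CandidatePlan("default", 0.5, 0)
candidates = [
    default,
    CandidatePlan("leading-hint-A", 0.9, 1),
    CandidatePlan("operator-hint-B", 0.7, 2),
]
reliability = {0: 1.0, 1: 0.4, 2: 0.85}

chosen = select_plan(candidates, reliability, 0.8, default)
print(chosen.hint)
```

In this toy input, the plan with the highest score is skipped because its cluster's reliability falls below the threshold, so the selection falls to the best plan among the trusted clusters.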