JG2Time: A Learned Time Estimator for Join Operators Based on Heterogeneous Join-Graphs

Published: 2023, Last Modified: 13 Feb 2024, Venue: DASFAA (1) 2023, Readers: Everyone
Abstract: The join operator is one of the key operators in an RDBMS, and estimating its evaluation time is a fundamental task in query optimization, scheduling, and similar tasks. However, a precise estimate is hard to obtain, because the time depends not only on the physical join implementation (hash, sort, loop) but also on the corresponding parameters, such as the size of the data, the number of partitions, and the number of threads in a modern hash join. Existing works either rely on time complexity analysis, which yields rough results, or employ machine learning techniques to build a predictive model, which requires many training instances. In this paper, we propose a method, named JG2Time, that estimates the running time using join-graphs constructed from the source code. Specifically, we construct a heterogeneous join-graph by annotating parameter nodes onto a call-graph generated by runtime analysis tools, and propose ReGAT, a heterogeneous graph neural network, to fully capture the edge weights (the number of function calls) in the join-graph. The embeddings learned by ReGAT can be used to predict the running time. In addition, we optimize JG2Time with a multi-task model that also predicts the times of function calls, and with an unsupervised code learning method to enhance its generalization. The experimental results illustrate the effectiveness of JG2Time and its optimization strategies.
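
As a rough illustration of the join-graph idea described in the abstract, the following Python sketch builds a small heterogeneous graph with two node types (function and parameter) and call edges weighted by the number of function calls. It is not the authors' implementation: the class `JoinGraph`, the edge labels `calls` and `configures`, the way parameter nodes attach to functions, and the sample profiler counts are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class JoinGraph:
    """Hypothetical container for a heterogeneous join-graph."""
    # node name -> node type ("function" or "parameter")
    nodes: dict = field(default_factory=dict)
    # (source, target) -> (edge type, weight)
    edges: dict = field(default_factory=dict)

    def add_call(self, caller, callee, count):
        """Add a call edge; the weight accumulates the number of calls."""
        self.nodes.setdefault(caller, "function")
        self.nodes.setdefault(callee, "function")
        key = (caller, callee)
        _, previous = self.edges.get(key, ("calls", 0))
        self.edges[key] = ("calls", previous + count)

    def annotate(self, param, value, function):
        """Attach a parameter node to a function node (assumed scheme)."""
        name = f"{param}={value}"
        self.nodes[name] = "parameter"
        self.nodes.setdefault(function, "function")
        self.edges[(name, function)] = ("configures", 1)
        return name


def build_join_graph(call_counts, params):
    """Combine profiler call counts and join parameters into one graph."""
    graph = JoinGraph()
    for (caller, callee), count in call_counts.items():
        graph.add_call(caller, callee, count)
    for (param, function), value in params.items():
        graph.annotate(param, value, function)
    return graph


# Made-up call counts, standing in for the output of a profiling tool
# run on a partitioned hash join. Function names are hypothetical.
call_counts = {
    ("hash_join", "partition"): 8,
    ("hash_join", "build"): 8,
    ("hash_join", "probe"): 8,
    ("probe", "hash"): 1000,
}
# Made-up parameter annotations: (parameter, annotated function) -> value.
params = {
    ("num_partitions", "partition"): 8,
    ("num_threads", "hash_join"): 4,
}

graph = build_join_graph(call_counts, params)
```

A graph neural network such as the ReGAT model named in the abstract would then consume the node types and edge weights; that model is not sketched here because the abstract does not describe its architecture.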