Domain Query Optimization: Adapting the General-Purpose Database System Hyper for Tableau Workloads

Published: 2019, Last Modified: 12 May 2025 | BTW 2019 | CC BY-SA 4.0
Abstract: The Hyper database system was started as an academic project at the Technical University of Munich. In 2016, the commercial spin-off of the academic Hyper database system was acquired by Tableau, a leader in the analytics and business intelligence (BI) platforms market. As a human-in-the-loop BI platform, Tableau products machine-generate query workloads whose characteristics differ from those of human-written queries and of the queries found in industry-standard database system benchmarks. In this work, we contribute optimizations we developed for one important class of queries typically generated by Tableau products: retrieving (aggregates of) the domain of a column. We devise methods for leveraging the compression of the database column in order to efficiently retrieve the duplicate-free value set, i.e., the domain. Our extensive performance evaluation on a synthetic benchmark and on over 60 thousand real-world workbooks from Tableau Public shows that our optimization enables query latencies for domain queries that allow self-service ad-hoc data exploration.
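To illustrate the general idea of answering a domain query from a compressed representation, the following is a minimal sketch that assumes a simple dictionary-encoded column. The abstract does not specify which compression schemes Hyper uses or how the optimization is implemented, so `DictColumn`, `encode`, `domain_by_scan`, and `domain_from_dictionary` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DictColumn:
    """Hypothetical dictionary-encoded column (an assumption, not Hyper's format)."""
    dictionary: list  # distinct values, one entry per code
    codes: list       # one code per row, indexing into `dictionary`


def encode(values):
    """Dictionary-encode a list of values in first-seen order."""
    index = {}
    dictionary = []
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return DictColumn(dictionary, codes)


def domain_by_scan(col):
    """Baseline: decode every row and deduplicate, touching all rows."""
    return sorted({col.dictionary[c] for c in col.codes})


def domain_from_dictionary(col):
    """Shortcut: read the dictionary directly, touching only distinct values.

    This is only valid if every dictionary entry is still referenced by
    at least one row, which holds for a column built by `encode` above.
    """
    return sorted(col.dictionary)


col = encode(["DE", "US", "DE", "FR", "US"])
print(domain_from_dictionary(col))
```

In this sketch, the shortcut reads one entry per distinct value rather than one per row, which is the kind of saving a compression-aware domain query can exploit when the number of distinct values is much smaller than the row count.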