Locality-Adaptive Parallel Hash Joins Using Hardware Transactional Memory

Published: 01 Jan 2016, Last Modified: 13 Nov 2024
Venue: ADMS/IMDM@VLDB 2016
License: CC BY-SA 4.0
Abstract: Previous work [1] has claimed that the best-performing implementation of in-memory hash joins is based on (radix-)partitioning of the build-side input. Indeed, when the input data is randomly ordered, the benefits of increased cache locality and synchronization-free parallelism in the build phase outweigh the overhead of partitioning. However, many datasets already exhibit significant spatial locality (i.e., non-randomness) because of the way data items enter the database: through periodic ETL or trickle-loaded in the form of transactions. In such cases, the first benefit of partitioning, increased locality, is largely irrelevant. In this paper, we demonstrate how hardware transactional memory (HTM) can render the other benefit, freedom from synchronization, irrelevant as well.
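
For readers unfamiliar with the partitioned approach the abstract refers to, the following is a minimal single-threaded sketch of a radix-partitioned hash join. It is not the paper's implementation: the names `radix_partition`, `partitioned_hash_join`, and `RADIX_BITS`, the `(key, payload)` row layout, and the choice of integer keys are all assumptions made for illustration. It only shows why each partition can be built and probed without touching shared state; it does not illustrate HTM, which is not accessible from Python.

```python
from collections import defaultdict

RADIX_BITS = 2  # hypothetical setting: 2 bits give 4 partitions


def radix_partition(rows, key_index=0, bits=RADIX_BITS):
    """Split rows into 2**bits partitions using the low bits of an integer key."""
    mask = (1 << bits) - 1
    partitions = [[] for _ in range(1 << bits)]
    for row in rows:
        partitions[row[key_index] & mask].append(row)
    return partitions


def partitioned_hash_join(build_rows, probe_rows, bits=RADIX_BITS):
    """Equi-join two lists of (key, payload) tuples on key."""
    build_parts = radix_partition(build_rows, bits=bits)
    probe_parts = radix_partition(probe_rows, bits=bits)
    result = []
    # Matching keys always land in the same partition index, so each
    # partition pair is independent. A parallel version could give each
    # pair to its own worker with no lock on a shared hash table.
    for build_part, probe_part in zip(build_parts, probe_parts):
        table = defaultdict(list)  # private hash table for this partition
        for key, payload in build_part:
            table[key].append(payload)
        for key, payload in probe_part:
            for build_payload in table.get(key, ()):
                result.append((key, build_payload, payload))
    return result
```

The extra pass in `radix_partition` is the partitioning overhead the abstract mentions; the per-partition private table is what makes the build phase synchronization-free.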