An Experimental Study on Federated Equi-Joins

Published: 01 Jan 2024, Last Modified: 10 Feb 2025 | IEEE Trans. Knowl. Data Eng. 2024 | CC BY-SA 4.0
Abstract: Data federation has emerged as a novel kind of database system that enables collaborative queries across mutually distrustful data owners. The federated equi-join, a commonly used operation in data federation, combines relations from distinct data owners while preserving their data privacy. Because this query has wide applications, many solutions to federated equi-joins have been proposed. However, it is still challenging for practitioners to choose the most appropriate algorithm, for several reasons: incomplete evaluation protocols (e.g., no evaluation of multi-way equi-joins), under-explored performance metrics (e.g., main memory usage), and the absence of a standardized comparison. Motivated by these challenges, this paper conducts a comprehensive experimental study and builds a new benchmark, called FEJ-Bench, for federated equi-joins. The experimental study and the benchmark cover eight state-of-the-art algorithms and five datasets. Our evaluation reveals the query efficiency ranking of these algorithms, the factors that affect it, and potential research opportunities. Finally, we open-source FEJ-Bench, the first benchmark for federated equi-joins, on GitHub. Our findings aim to guide researchers and practitioners in deploying federated equi-joins in practice.