Partial Materialization for Data Integration in SQL-on-Hadoop Engines

Published: 01 Jan 2016, Last Modified: 09 Aug 2024, Venue: ICITCS 2016, License: CC BY-SA 4.0
Abstract: SQL-on-Hadoop engines are considered useful data integration tools for large-scale data. However, when the query result includes many attributes, they may incur redundant network overhead by redistributing the intermediate results multiple times. We propose an optimization method based on partial materialization, which avoids repeatedly redistributing trivial attributes.
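The idea in the abstract can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the paper's actual method: it assumes "trivial attributes" means columns that appear in the query result but are not needed as join keys, and it approximates network cost by counting the characters of every value that would be redistributed. All table names, column names, and helper functions (`project`, `hash_join`, `shipped_chars`) are hypothetical.

```python
def project(rows, attrs):
    """Keep only the listed attributes of each row."""
    return [{a: r[a] for a in attrs} for r in rows]

def hash_join(left, right, key):
    """Plain in-memory equi-join on a single key."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def shipped_chars(rows):
    """Toy network cost: characters of all values that get redistributed."""
    return sum(len(str(v)) for r in rows for v in r.values())

# Hypothetical tables; "note" is a wide attribute that no join needs.
orders = [
    {"oid": 1, "cid": 10, "pid": 100, "note": "x" * 50},
    {"oid": 2, "cid": 11, "pid": 101, "note": "y" * 50},
]
customers = [{"cid": 10, "cname": "Ann"}, {"cid": 11, "cname": "Bob"}]
products = [{"pid": 100, "pname": "pen"}, {"pid": 101, "pname": "ink"}]

# Baseline: full rows are redistributed before each of the two joins.
cost_full = shipped_chars(orders)            # redistribute on cid
step1 = hash_join(orders, customers, "cid")
cost_full += shipped_chars(step1)            # redistribute on pid
full = hash_join(step1, products, "pid")

# Sketch of partial materialization: set "note" aside keyed by oid,
# redistribute only the slim rows, and reattach "note" once at the end.
notes = {r["oid"]: r["note"] for r in orders}
slim = project(orders, ["oid", "cid", "pid"])
cost_partial = shipped_chars(slim)           # redistribute on cid
step1 = hash_join(slim, customers, "cid")
cost_partial += shipped_chars(step1)         # redistribute on pid
joined = hash_join(step1, products, "pid")
cost_partial += sum(len(n) for n in notes.values())  # ship "note" once
partial = [{**r, "note": notes[r["oid"]]} for r in joined]
```

In this sketch both plans produce the same rows, but the wide `note` column crosses the network once instead of once per join, so the gap in toy cost grows with the number of redistribution steps.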