GOM-Hadoop: A distributed framework for efficient analytics on ordered datasets

Published: 2015, Last Modified: 06 Feb 2025 | J. Parallel Distributed Comput. 2015 | CC BY-SA 4.0
Abstract: Highlights
• We generalize a class of big data analytics workloads (Re-Org) on ordered datasets.
• We propose a novel distributed mechanism for efficiently executing Re-Org tasks.
• The proposed mechanism is implemented in a distributed framework by extending Hadoop.
• A model is presented to formally study the proposed framework.
• Experiments show that our framework is 6.3x faster than vanilla Hadoop.