Parallelizing Probabilistic Streaming Skyline Operator in Cloud Computing Environments

Published: 2013, Last Modified: 02 Mar 2026, COMPSAC 2013, CC BY-SA 4.0
Abstract: Skyline query processing over uncertain data streams has received considerable attention because it helps users make intelligent decisions over complex data. Nevertheless, existing studies focus only on retrieving skylines over data streams in a centralized environment, typically with a single processor, which limits the scalability of the algorithms and cannot meet the requirements of massive data analysis. Emerging cloud computing environments are more reliable and stable than traditional distributed environments and are well suited to massive data management and complex queries. Unfortunately, existing parallel frameworks in clouds, such as MapReduce and its variants, are not suitable for skyline queries over uncertain data streams. In this paper, we propose a general framework for parallelizing the probabilistic streaming skyline operator based on sliding window partitioning. In particular, we propose four item-mapping strategies, CMS, AMS, DMS and APS, to optimize queries on the proposed parallel framework. Extensive experiments on a real deployment demonstrate the effectiveness and efficiency of the proposals.