Abstract: With the rapid growth of big data applications, stream computing systems face increasingly strict requirements on data processing speed. Storm, one of the most popular distributed stream computing systems, has received growing attention. However, Storm's default scheduling strategy is not well suited to processing large volumes of streaming data. Resource scheduling in a distributed stream computing system should consider not only the allocation status of nodes but also the fluctuating input rate of the data stream. To address this problem, this paper makes the following contributions: (1) A performance model, La-Stream (latency-sensitive elastic adaptive scheduling), is proposed and built by quantifying the computation required by the tasks mapped to nodes and the communication between nodes. (2) A La-Stream based algorithm is proposed. The algorithm dynamically plans, among the available resources, the resource allocation scheme with minimal data processing latency. (3) Three functional modules of La-Stream are proposed and implemented: Monitor, Optimizer, and Scheduler. The three modules are integrated into the Storm platform with minimal overhead. Several sets of experiments are conducted to verify the feasibility and effectiveness of La-Stream.