Tornado: A System For Real-Time Iterative Analysis Over Evolving Data

Published: 2016, Last Modified: 17 May 2024, SIGMOD Conference 2016, CC BY-SA 4.0
Abstract: There is an increasing demand for real-time iterative analysis over evolving data. In this paper, we propose a novel execution model to obtain timely results at given instants. We observe that a loop starting from a good initial guess usually converges quickly. Hence we organize the execution of iterative methods over evolving data into a main loop and several branch loops. The main loop gathers the inputs and maintains an approximation of the timely results. When a user requests the results, a branch loop is forked from the main loop and iterates until convergence to produce them. Using the approximation maintained by the main loop, the branch loops can start from a point near the fixed point and converge quickly. Since the approximation error arises from inputs that are not yet reflected in the approximation, we develop a novel bounded asynchronous iteration model to enhance timeliness. The bounded asynchronous iteration model achieves fine-grained updates while ensuring correctness for general iterative methods. Based on the proposed execution model, we design and implement a prototype system named Tornado on top of Storm. Tornado provides a graph-parallel programming model that eases the programming of most real-time iterative analysis tasks. Reliability is also enhanced by efficient fault-tolerance mechanisms. An empirical evaluation of Tornado shows that, with our execution model, various real-time iterative analysis tasks can improve their performance and tolerate failures efficiently.
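The main-loop and branch-loop idea described in the abstract can be illustrated with a small single-process sketch. This is not Tornado's API or its bounded asynchronous model: the class `MainLoop`, the functions `step` and `iterate_to_convergence`, the constants, and the choice of a PageRank-style update are all assumptions made for illustration. The main loop applies one cheap refinement per incoming edge, and a branch copies the current approximation and iterates it to a fixed point.

```python
DAMPING = 0.85   # hypothetical damping factor for the toy PageRank update
TOL = 1e-9       # hypothetical convergence threshold


def step(ranks, edges):
    """One synchronous PageRank-style sweep; returns a new rank vector."""
    n = len(ranks)
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1
    new = [(1.0 - DAMPING) / n] * n
    for src, dst in edges:
        new[dst] += DAMPING * ranks[src] / out_deg[src]
    # Vertices without outgoing edges spread their mass uniformly.
    dangling = sum(ranks[i] for i in range(n) if out_deg[i] == 0)
    for i in range(n):
        new[i] += DAMPING * dangling / n
    return new


def iterate_to_convergence(ranks, edges):
    """Sweep until successive iterates differ by less than TOL."""
    iterations = 0
    while True:
        new = step(ranks, edges)
        iterations += 1
        if max(abs(a - b) for a, b in zip(new, ranks)) < TOL:
            return new, iterations
        ranks = new


class MainLoop:
    """Gathers inputs and keeps a running approximation of the result."""

    def __init__(self, n):
        self.edges = []
        self.ranks = [1.0 / n] * n

    def ingest(self, edge):
        # Absorb one input, then refine the approximation by a single sweep.
        self.edges.append(edge)
        self.ranks = step(self.ranks, self.edges)

    def fork_branch(self):
        # A branch works on copies, so the main loop's state is untouched.
        return iterate_to_convergence(list(self.ranks), list(self.edges))


def demo():
    main = MainLoop(4)
    for edge in [(0, 1), (1, 2), (2, 0), (3, 0), (0, 2)]:
        main.ingest(edge)
    warm, warm_iters = main.fork_branch()
    cold, cold_iters = iterate_to_convergence([0.25] * 4, main.edges)
    print("warm-start sweeps:", warm_iters, "cold-start sweeps:", cold_iters)
    return warm, cold


if __name__ == "__main__":
    demo()
```

Both calls in `demo` converge to the same fixed point because the update is a contraction. The sweep counts are printed rather than asserted: one would typically expect the branch forked from the main loop to need no more sweeps than a cold start, but the size of the gap depends on how closely the main loop's approximation tracks the inputs.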