Dynamic and Transparent Memory Sharing for Accelerating Big Data Analytics Workloads in Virtualized Cloud

2018 (modified: 17 Nov 2022) | IEEE BigData 2018 | Readers: Everyone
Abstract: Many big data applications are memory-intensive and run iterative analytics algorithms. When the dataset used in each iteration of an analytics job exceeds the physical memory allocated to it, such workloads suffer serious performance degradation or fail with out-of-memory errors. Existing proposals focus on estimating the working set size for accurate resource allocation of executors, but lack the desired efficiency and transparency. This paper presents FastSwap, an efficient shared-memory-based memory paging service. The design of FastSwap makes a number of original contributions. First, FastSwap improves VM memory swapping performance by leveraging idle host memory and redirecting VM swapping traffic to a compressed swap area in host-guest shared memory. Second, FastSwap develops a compressed swap page table, an efficient index structure that achieves high utilization of the shared memory swap area by supporting multiple granularities of compression factors. Third, FastSwap provides hybrid swap-out and proactive swap-in to further improve the performance of shared memory swapping. Finally, FastSwap is by design lightweight and non-intrusive. We evaluate FastSwap using a set of well-known big data analytics workloads and benchmarks, such as Spark, Redis, HiBench, SparkBench and YCSB. The results show that FastSwap offers up to two orders of magnitude performance improvement over existing memory swapping methods and is more than four orders of magnitude faster than the conventional disk-based VM swapping facility.
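To make the idea of a compressed swap page table with several compression granularities more concrete, the following is a minimal sketch and not FastSwap's actual implementation. It assumes a 4 KiB page, three hypothetical slot sizes (compression factors of 4x, 2x and 1x), zlib as a stand-in compressor, and an ordinary in-process dictionary in place of host-guest shared memory. The class and method names (`CompressedSwapArea`, `swap_out`, `swap_in`, `discard`) are invented for illustration. Hybrid swap-out and proactive swap-in are not modeled.

```python
import zlib

PAGE_SIZE = 4096
# Hypothetical slot sizes: compression factors of 4x, 2x, and 1x (raw page).
SLOT_SIZES = (PAGE_SIZE // 4, PAGE_SIZE // 2, PAGE_SIZE)


class CompressedSwapArea:
    """Toy swap area indexed by a page table (illustrative assumption only)."""

    def __init__(self):
        # One list of slots per slot size; None marks a free slot.
        self.slots = {size: [] for size in SLOT_SIZES}
        # Page table: page number -> (slot size, slot index, is_compressed).
        self.table = {}

    def discard(self, page_no):
        """Free the slot held by page_no, if any."""
        entry = self.table.pop(page_no, None)
        if entry is not None:
            size, index, _ = entry
            self.slots[size][index] = None

    def swap_out(self, page_no, data):
        """Store one page and return the slot size that was chosen."""
        if len(data) != PAGE_SIZE:
            raise ValueError("expected exactly one page of data")
        self.discard(page_no)  # drop any stale copy of this page
        packed = zlib.compress(data)
        # Smallest compressed slot that fits; otherwise keep the raw page.
        size = next((s for s in SLOT_SIZES[:-1] if len(packed) <= s), PAGE_SIZE)
        compressed = size < PAGE_SIZE
        payload = packed if compressed else data
        bucket = self.slots[size]
        if None in bucket:
            index = bucket.index(None)  # reuse a freed slot
            bucket[index] = payload
        else:
            index = len(bucket)
            bucket.append(payload)
        self.table[page_no] = (size, index, compressed)
        return size

    def swap_in(self, page_no):
        """Return the original page contents and free its slot."""
        size, index, compressed = self.table[page_no]
        payload = self.slots[size][index]
        self.discard(page_no)
        return zlib.decompress(payload) if compressed else payload
```

In this sketch a highly compressible page (for example, all zeros) lands in the smallest slot class, while a page that does not compress is stored raw in a full-size slot, so the area never spends more than one page per entry. A real design would also need eviction, concurrency control, and a shared-memory transport, none of which the abstract specifies.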