Abstract: Erasure codes are widely advocated as a viable means to ensure the dependability of key-value storage systems for big data applications (e.g., MapReduce). An erasure code divides user data into several data splits, encodes the data splits to generate parity splits, and stores all splits across storage nodes. Reducing disk input/output (I/O) latency is a well-known challenge in improving the performance of erasure-coding-based storage systems. In this paper, we consider the problem of reducing the latency of read operations by caching splits in the memory of storage nodes. We find that the key to solving this problem is for storage nodes to cache enough splits in memory so that the application server can reconstruct objects without reading data from disks. We design an efficient memory caching scheme, named ECCS. Theoretical analysis verifies that ECCS can effectively reduce the latency of read operations. Accordingly, we implement a prototype storage system to deploy our proposal. Extensive experiments are conducted on the prototype with a real-world storage cluster and traces. The experimental results show that our proposal reduces the time of read operations by up to 32% and improves the throughput of read operations by up to 48% compared with current caching approaches.
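To make the split/parity/reconstruction idea in the abstract concrete, the sketch below implements the simplest possible erasure code: k data splits plus one XOR parity split, so that any k of the k+1 splits suffice to rebuild the object. If a storage node holds k of these splits in memory, a lost or disk-resident split can be reconstructed without a disk read. This is only an illustrative toy; the function names and the (k, 1) XOR scheme are assumptions for exposition and are not the actual code used by ECCS, which targets general erasure codes.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> list[bytes]:
    """Divide `data` into k equal-size data splits and append one XOR parity split."""
    size = -(-len(data) // k)                  # ceiling division: size of each split
    padded = data.ljust(size * k, b"\0")       # zero-pad so all splits are equal length
    splits = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, splits)         # parity = XOR of all data splits
    return splits + [parity]                   # k data splits + 1 parity split

def reconstruct(splits: list) -> list:
    """Rebuild the single missing split (marked None) by XOR-ing the k available ones."""
    missing = splits.index(None)
    available = [s for s in splits if s is not None]
    splits[missing] = reduce(xor_bytes, available)
    return splits
```

A read that finds any k splits already cached can therefore serve the object entirely from memory, which is the mechanism the paper's caching scheme exploits.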
External IDs: dblp:conf/bdccf/ShenLSZW18