MDS array codes with efficient repair and small sub-packetization level

Published: 01 Jan 2024, Last Modified: 25 Jan 2025, Des. Codes Cryptogr. 2024, CC BY-SA 4.0
Abstract: Modern data centers use erasure codes to provide high storage efficiency and fault tolerance. Reed–Solomon codes are commonly deployed in large-scale distributed storage systems because they are easy to implement, but they consume massive bandwidth during node repair. Minimum storage regenerating (MSR) codes are a class of maximum distance separable (MDS) codes that achieve the lower bound on repair bandwidth. However, an exponential sub-packetization level is inevitable for MSR codes, resulting in massive disk I/O consumption during node repair. Disk I/O is becoming the performance bottleneck in data centers where the storage system must frequently provide high-speed data access to clients. In this paper, we consider disk I/O as an important metric for evaluating the performance of a code and construct MDS array codes with efficient repair at a small sub-packetization level. Specifically, two explicit families of MDS codes with efficient repair are proposed at a sub-packetization level of \({\mathcal {O}}(r)\), where r denotes the number of parities. The first family of codes is constructed over a finite field \({\mathbb {F}}_{q^m}\), where \(q \ge n\) is a prime power, \(m > r(l-1) +1\), and n and l denote the code length and sub-packetization level, respectively. The second family of codes is built upon a special binary polynomial ring in which the only computation operations during node repair and file reconstruction are XORs and cyclic shifts, avoiding complex multiplications and divisions over large finite fields.
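To illustrate why arithmetic in a binary polynomial ring reduces to XORs and cyclic shifts, the following minimal sketch uses the ring \({\mathbb {F}}_2[x]/(x^p+1)\) as an assumed stand-in; the abstract does not specify the ring the paper actually uses. The parameter `P`, the helper names, and the single-parity layout are hypothetical and are not the paper's construction.

```python
# Illustrative sketch only: arithmetic in F_2[x]/(x^P + 1), an ASSUMED ring.
# An element is stored as a P-bit integer; bit i is the coefficient of x^i.
P = 7                    # hypothetical ring parameter
MASK = (1 << P) - 1      # keeps results within P bits


def add(a, b):
    """Ring addition: coefficient-wise XOR."""
    return a ^ b


def mul_x_pow(a, s):
    """Multiply by x^s: a cyclic left shift by s positions within P bits."""
    s %= P               # x^P = 1, so exponents wrap (also handles negative s)
    return ((a << s) | (a >> (P - s))) & MASK


def encode_parity(data, shifts):
    """Hypothetical single parity: sum over i of x^(shifts[i]) * data[i]."""
    parity = 0
    for d, s in zip(data, shifts):
        parity = add(parity, mul_x_pow(d, s))
    return parity


def recover_one(parity, data, shifts, lost):
    """Recover one erased data symbol using only XORs and cyclic shifts."""
    acc = parity
    for i, (d, s) in enumerate(zip(data, shifts)):
        if i != lost:
            acc = add(acc, mul_x_pow(d, s))   # cancel the surviving terms
    return mul_x_pow(acc, -shifts[lost])      # undo the shift: x^(-s) = x^(P-s)


if __name__ == "__main__":
    data = [0b1011001, 0b0000111, 0b1110000]
    shifts = [0, 1, 2]
    parity = encode_parity(data, shifts)
    # The entry at index 1 is ignored during recovery, as if it were erased.
    print(recover_one(parity, data, shifts, lost=1) == data[1])
```

Multiplying by a monomial never requires a field multiplication or division here, which is the property the abstract attributes to the second family; the paper's actual codes use multiple parities and a repair scheme that this sketch does not model.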