D2-Tree: A Distributed Double-Layer Namespace Tree Partition Scheme for Metadata Management in Large-Scale Storage Systems

Published: 2018, Last Modified: 15 Jan 2026, Venue: ICDCS 2018, License: CC BY-SA 4.0
Abstract: The behavior of the metadata server (MDS) cluster is critical to the overall performance of today's petabyte-scale, or even exabyte-scale, distributed file systems. Maintaining both high locality and good load balancing is a significant challenge for MDS clusters. Traditional metadata management schemes, including hash-based mapping and subtree partitioning, are heavily biased toward either locality or load balancing at the expense of the other. In this paper, we propose D2-Tree, a distributed double-layer namespace tree partition scheme for metadata management in large-scale storage systems. The key idea is a greedy strategy that splits the namespace tree into a global layer and local layer subtrees: the global layer is replicated to maintain load balancing, and the lower-half subtrees are allocated separately to the MDSs by a mirror division method to preserve locality. Both theoretical analysis based on the empirical cumulative distribution and extensive experiments are provided to validate the efficiency of D2-Tree. Experiments using real trace data on Amazon EC2 also show that D2-Tree outperforms many previously proposed schemes.
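
The double-layer idea can be illustrated with a minimal Python sketch. The abstract does not specify the greedy criterion or the mirror division method, so everything below is an assumption made for illustration only: the per-node `load` counter, the rule of promoting the hottest frontier node into the global layer until a hypothetical `global_budget` is reached, and the least-loaded-first placement that stands in for mirror division. It is not the authors' algorithm.

```python
from dataclasses import dataclass, field
import heapq


@dataclass(eq=False)
class Node:
    """A namespace tree node; `load` is a hypothetical access count."""
    name: str
    load: int = 0
    children: list["Node"] = field(default_factory=list)


def subtree_load(node: Node) -> int:
    """Total load of a node plus all of its descendants."""
    return node.load + sum(subtree_load(c) for c in node.children)


def split_layers(root: Node, global_budget: int):
    """Greedily grow the global layer from the root (assumed criterion).

    At each step the frontier node with the heaviest subtree is promoted
    into the global layer. The nodes left on the frontier become the roots
    of the local layer subtrees.
    """
    global_layer = [root]
    frontier = list(root.children)
    while frontier and len(global_layer) < global_budget:
        hottest = max(frontier, key=subtree_load)
        frontier.remove(hottest)
        global_layer.append(hottest)
        frontier.extend(hottest.children)
    return global_layer, frontier


def assign_subtrees(local_roots: list[Node], num_mds: int) -> dict[str, int]:
    """Place each local subtree whole on the least-loaded MDS.

    This is a stand-in for the paper's mirror division method, which the
    abstract does not describe. Keeping a subtree on one MDS preserves
    locality; heaviest-first placement evens out the load.
    """
    heap = [(0, mds) for mds in range(num_mds)]
    placement = {}
    for sub in sorted(local_roots, key=subtree_load, reverse=True):
        load, mds = heapq.heappop(heap)
        placement[sub.name] = mds
        heapq.heappush(heap, (load + subtree_load(sub), mds))
    return placement
```

In this sketch the global layer would be the part replicated for load balancing, while each local subtree stays on a single MDS so that operations inside it remain local.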