Abstract: Deploying deduplication for distributed primary storage is challenging because low read/write latency, stable read/write performance, and efficient space saving are all of paramount importance. Existing schemes cannot satisfy all of these requirements simultaneously. In this article, we propose D$^{3}$, a dynamic dual-phase deduplication framework for distributed primary storage. D$^{3}$ makes three major contributions. First, we formulate a deduplication-oriented taxonomy called Dedup-Type, which groups data with similar deduplication-related characteristics into larger categories. It serves as a coarse-grained filter and as one of the prioritizing references in D$^{3}$. Second, D$^{3}$ is a dual-phase framework: the inline-phase and offline-phase deduplication processes work in concert with each other. Third, D$^{3}$ operates in a dynamic manner. We design two critical mechanisms: context-aware threshold adjustment (CTA) for local inline-phase deduplication, and deferred priority-based enforcement (DPE) for global offline-phase deduplication. The CTA mechanism enables selective deduplication under a periodically updated threshold. Data skipped during the inline phase is regarded as a candidate for the offline phase and is handled in prioritized order under the governance of the DPE mechanism.
Evaluation results demonstrate that, compared with conventional inline and offline deduplication schemes, D$^{3}$ achieves more efficient and more stable read/write performance with competitive space saving.
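The interplay of the two phases can be illustrated with a minimal sketch. It is not the D$^{3}$ implementation: the abstract does not specify what the threshold is compared against, how it is updated, or how priorities are computed. The class `DualPhaseDedupSketch`, the per-write `score`, the load-based update rule in `adjust_threshold`, and the integer `priority` are all hypothetical names and assumptions introduced here for illustration only.

```python
import hashlib
import heapq


class DualPhaseDedupSketch:
    """Toy model: threshold-gated inline dedup plus a prioritized offline pass."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed: minimum score for inline dedup
        self.index = {}             # fingerprint -> reference count
        self.pending = []           # heap of (-priority, sequence, chunk)
        self._seq = 0               # tie-breaker so chunks are never compared

    @staticmethod
    def fingerprint(chunk):
        return hashlib.sha256(chunk).hexdigest()

    def adjust_threshold(self, load):
        # Assumed update rule (not from the paper): a busier system raises
        # the threshold, so fewer writes pay the inline deduplication cost.
        self.threshold = min(1.0, max(0.0, load))

    def _dedup(self, chunk):
        # Record the chunk's fingerprint; return True if it was a duplicate.
        fp = self.fingerprint(chunk)
        is_duplicate = fp in self.index
        self.index[fp] = self.index.get(fp, 0) + 1
        return is_duplicate

    def write(self, chunk, score, priority):
        # Inline phase: deduplicate only when the score meets the threshold.
        if score >= self.threshold:
            self._dedup(chunk)
            return "inline"
        # Otherwise defer the chunk as a candidate for the offline phase.
        heapq.heappush(self.pending, (-priority, self._seq, chunk))
        self._seq += 1
        return "deferred"

    def run_offline(self, budget):
        # Offline phase: process up to `budget` deferred chunks,
        # highest priority first; return the priorities in processing order.
        processed = []
        while self.pending and len(processed) < budget:
            neg_priority, _, chunk = heapq.heappop(self.pending)
            self._dedup(chunk)
            processed.append(-neg_priority)
        return processed
```

In this sketch, a write whose score falls below the current threshold is not lost to deduplication; it is queued and later handled by `run_offline`, which mirrors the abstract's statement that data skipped inline becomes a prioritized candidate for the offline phase.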