Abstract: The Log-Structured Merge-Tree (LSM-tree) offers efficient write performance and performs well in big data scenarios. An LSM-tree transforms random writes into batched sequential writes through a multilayer storage structure. However, compaction, the core operation of the LSM-tree, inevitably causes periodic degradation in read performance. Compaction runs routinely but at irregular intervals, which makes it difficult for the cache to track the access information of data blocks. This work studies how to address the resulting cache invalidation problem. We propose a two-phase parallel prefetching approach that mitigates cache invalidation when compaction occurs. Our experimental results show that the method effectively improves read performance.