Three-level Compact Caching for Search Engines Based on Solid State Drives
Abstract: Query processing in search engines can be divided into disk I/O operations and CPU computation operations. Conventional search engine caching methods mostly aim at reducing disk I/O operations. However, with the new trend of using solid state drives (SSDs) instead of hard disk drives for search engine storage, the bottleneck of query processing shifts from disk I/O to computation. In this paper, we design a three-level compact caching structure suited to SSD-based search engines, incorporating caches for (a) doculets, partial documents which contain previously encountered snippets, (b) fragments, compact data structures containing snippet metadata, and (c) ranked docID lists, which index the top-k documents relevant to a query. This three-level compact caching method aims to reduce repetitive CPU computation, which becomes the bottleneck on SSD-based systems. Moreover, by treating the different cache levels as a single system, the proposed method eliminates redundancy among cache items, which increases the efficiency of cache space utilization. Experimental results indicate that query response latency improves by a factor of around 2.6 with static caching and 1.3 with dynamic caching.
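To make the three levels concrete, the following is a minimal illustrative sketch of how a lookup might pass through them. It is not the paper's implementation: the LRU eviction policy, the class and method names, the choice of character offsets as fragment metadata, and the `rank` and `load_document` callbacks are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache; stands in for any one cache level (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class ThreeLevelCache:
    """Hypothetical layout: ranked docID lists, fragments, doculets."""

    def __init__(self, capacity, rank, load_document, k=2, width=20):
        self.ranked = LRUCache(capacity)     # query -> top-k docIDs
        self.fragments = LRUCache(capacity)  # (query, docID) -> (start, end)
        self.doculets = LRUCache(capacity)   # docID -> {(start, end): text}
        self.rank = rank                     # assumed ranking callback
        self.load_document = load_document   # assumed storage callback
        self.k = k
        self.width = width                   # assumed snippet length

    def answer(self, query):
        doc_ids = self.ranked.get(query)
        if doc_ids is None:                  # miss: rank and remember top-k
            doc_ids = self.rank(query)[: self.k]
            self.ranked.put(query, doc_ids)
        return [(d, self._snippet(query, d)) for d in doc_ids]

    def _snippet(self, query, doc_id):
        span = self.fragments.get((query, doc_id))
        doculet = self.doculets.get(doc_id) or {}
        if span is not None and span in doculet:
            return doculet[span]             # served without touching storage
        text = self.load_document(doc_id)    # fall back to the full document
        if span is None:
            # Toy snippet selection: a window starting at the first match.
            pos = max(text.find(query), 0)
            span = (pos, pos + self.width)
            self.fragments.put((query, doc_id), span)
        doculet[span] = text[span[0]:span[1]]
        self.doculets.put(doc_id, doculet)
        return doculet[span]
```

In this sketch the snippet text is stored once per document in the doculet, while the fragment holds only offsets, which is one plausible reading of how redundancy among cache items could be avoided. A repeated query is then answered from the three caches without re-ranking or reloading any document.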