Index compression is good, especially for random access

Published: 2007, Last Modified: 30 Jan 2024, Venue: CIKM 2007, Readers: Everyone
Abstract: Index compression techniques are known to substantially decrease the storage requirements of a text retrieval system. As a side-effect, they may increase its retrieval performance by reducing disk I/O overhead. Despite this advantage, developers sometimes choose to store index data in uncompressed form, in order to not obstruct random access into each index term's postings list. In this paper, we show that index compression does not harm random access performance. In fact, we demonstrate that, in some cases, random access into a term's postings list may be realized more efficiently if the list is stored in compressed form instead of uncompressed. This is regardless of whether the index is stored on disk or in main memory, since both types of storage - hard drives and RAM - do not support efficient random access in the first place.
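The abstract does not say how random access into a compressed list is implemented, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes document IDs are delta-encoded with a variable-byte code and grouped into small blocks, with an uncompressed table of each block's first ID and byte offset. A lookup binary-searches that small table and then decodes at most one block. The names `CompressedPostings`, `next_geq`, and `block_size` are hypothetical.

```python
from bisect import bisect_right


def vbyte_encode(n):
    """Encode a non-negative int, 7 bits per byte, low-order bits first."""
    out = bytearray()
    while n >= 128:
        out.append(n & 0x7F)
        n >>= 7
    out.append(n | 0x80)  # high bit marks the final byte
    return bytes(out)


def vbyte_decode(buf, pos):
    """Decode one int starting at buf[pos]; return (value, next_pos)."""
    n = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        if b & 0x80:
            return n | ((b & 0x7F) << shift), pos
        n |= b << shift
        shift += 7


class CompressedPostings:
    """Hypothetical layout: vbyte deltas in blocks plus a per-block table."""

    def __init__(self, doc_ids, block_size=4):
        # doc_ids is assumed to be strictly increasing.
        self.data = bytearray()
        self.block_first = []   # first doc ID of each block (uncompressed)
        self.block_offset = []  # byte offset of each block's deltas
        self.block_size = block_size
        self.length = len(doc_ids)
        prev = 0
        for i, d in enumerate(doc_ids):
            if i % block_size == 0:
                # The block's first ID lives in the table, not in the data.
                self.block_first.append(d)
                self.block_offset.append(len(self.data))
            else:
                self.data += vbyte_encode(d - prev)
            prev = d

    def next_geq(self, target):
        """Return the smallest doc ID >= target, or None if there is none."""
        if not self.block_first:
            return None
        # Last block whose first ID is <= target.
        b = bisect_right(self.block_first, target) - 1
        if b < 0:
            return self.block_first[0]
        cur = self.block_first[b]
        if cur >= target:
            return cur
        # Decode only the deltas that belong to this one block.
        in_block = min(self.block_size, self.length - b * self.block_size)
        pos = self.block_offset[b]
        for _ in range(in_block - 1):
            delta, pos = vbyte_decode(self.data, pos)
            cur += delta
            if cur >= target:
                return cur
        # Not in this block: the answer is the next block's first ID, if any.
        if b + 1 < len(self.block_first):
            return self.block_first[b + 1]
        return None


postings = CompressedPostings([3, 7, 8, 15, 21, 40, 41, 300, 1000, 70000])
print(postings.next_geq(16))
print(postings.next_geq(42))
```

In this sketch the binary search touches only the small block table, and the compressed data is read sequentially within a single block. That access pattern is one plausible reading of why compression need not hurt lookups on storage that handles random access poorly.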