Abstract: In query processing, incorporating the proximity between query terms is beneficial for effective retrieval. However, storing the required positional data in inverted indexes inevitably adds storage and computation costs. In this paper, we propose a lossy method for compressing term position data for proximity-based retrieval. Our method exploits the clustering property of term occurrences: it adaptively clusters nearby occurrences and replaces the clustered positions with a centralized value. Experimental results show that our adaptive method is competitive with respect to index size, ranking efficiency, and effectiveness.
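The clustering step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the clustering criterion or the definition of the centralized value, so everything specific here is an assumption made for illustration: a fixed gap threshold (`max_gap`) stands in for the adaptive criterion, the lower median of each cluster stands in for the centralized value, and the function names `cluster_positions` and `compress_positions` are hypothetical.

```python
from statistics import median_low


def cluster_positions(positions, max_gap=3):
    """Group term positions into clusters of nearby occurrences.

    Assumption: two consecutive occurrences belong to the same cluster
    when they are at most `max_gap` positions apart. The paper's adaptive
    criterion is not stated in the abstract; this fixed threshold is a
    stand-in.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)  # close enough: extend current cluster
        else:
            clusters.append([pos])    # too far: start a new cluster
    return clusters


def compress_positions(positions, max_gap=3):
    """Lossily replace each cluster with (representative, count).

    Assumption: the representative is the lower median of the cluster,
    which keeps it an actual position from the original list.
    """
    return [
        (median_low(cluster), len(cluster))
        for cluster in cluster_positions(positions, max_gap)
    ]


if __name__ == "__main__":
    # Six occurrences of one term in one document collapse to three entries.
    print(compress_positions([3, 5, 6, 40, 42, 100]))
    # → [(5, 3), (40, 2), (100, 1)]
```

Keeping the occurrence count alongside each representative is a design choice of this sketch, not something the abstract specifies; it preserves term frequency even though the exact positions are lost.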