Large-scale cluster-based retrieval experiments on Turkish texts

Ismail Sengör Altingövde, Rifat Ozcan, Huseyin Cagdas Ocalan, Fazli Can, Özgür Ulusoy

2007 (modified: 11 Nov 2022), SIGIR 2007

Abstract: We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness comparable to that of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR compared with traditional CBR techniques and even FS.
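
The abstract names a cluster-skipping inverted index but does not describe its layout. As a rough illustration of the general idea only, the sketch below groups each term's postings by cluster so that query processing can pass over whole blocks belonging to clusters that were not selected. Everything in it is an assumption made for illustration: the function names `build_cluster_skipping_index` and `cbr_search`, the list-of-blocks layout (a real on-disk or in-memory index would more likely store skip pointers to the next cluster block), the plain term-frequency scoring, and the premise that the best clusters have already been chosen by some earlier step. It is not the authors' implementation.

```python
from collections import defaultdict


def build_cluster_skipping_index(docs, doc_cluster):
    """Build {term: [(cluster_id, [(doc_id, tf), ...]), ...]}.

    docs: {doc_id: [term, ...]}; doc_cluster: {doc_id: cluster_id}.
    Postings of each term are grouped into per-cluster blocks, sorted by
    cluster id, so a whole block can be skipped at query time.
    (Hypothetical layout, chosen for illustration.)
    """
    grouped = defaultdict(lambda: defaultdict(dict))
    for doc_id, terms in docs.items():
        cid = doc_cluster[doc_id]
        for term in terms:
            block = grouped[term][cid]
            block[doc_id] = block.get(doc_id, 0) + 1  # term frequency
    return {
        term: [(cid, sorted(by_cluster[cid].items())) for cid in sorted(by_cluster)]
        for term, by_cluster in grouped.items()
    }


def cbr_search(index, query_terms, best_clusters, top_k=10):
    """Score only documents in the selected clusters.

    best_clusters is assumed to be given (e.g. by a centroid-matching
    step not shown here). Scoring is a plain sum of term frequencies,
    an assumption standing in for a real ranking function.
    """
    scores = defaultdict(int)
    for term in query_terms:
        for cid, postings in index.get(term, []):
            if cid not in best_clusters:
                continue  # skip the entire block of this cluster
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Tiny usage example with made-up documents and cluster assignments.
docs = {1: ["kitap", "okul"], 2: ["kitap", "kitap"], 3: ["okul"], 4: ["kitap"]}
doc_cluster = {1: 0, 2: 0, 3: 1, 4: 1}
index = build_cluster_skipping_index(docs, doc_cluster)
results = cbr_search(index, ["kitap"], best_clusters={0})
```

In this sketch, restricting the query to cluster 0 means the block for cluster 1 in the posting list of "kitap" is never scanned, which is the source of the efficiency gain that such an index aims for.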
