Abstract: Our research challenge is to provide a mechanism for splitting into user task-based sessions a long-term log of queries submitted to a Web Search Engine (WSE). The hypothesis is that some query sessions entail the concept of user task. We present an approach that relies on a centroid-based and a density-based clustering algorithm, which consider queries inter-arrival times and use a novel distance function that takes care of query lexical content and exploits the collaborative knowledge collected by Wiktionary and Wikipedia.
0 Replies
Loading