Extracting Search Query Patterns via the Pairwise Coupled Topic Model

Published: 01 Jan 2016, Last Modified: 03 Oct 2024, Venue: WSDM 2016, License: CC BY-SA 4.0
Abstract: A fundamental yet new challenge in information retrieval is the identification of patterns behind search queries. For example, the queries "NY restaurant" and "boston hotel" share the common pattern "LOCATION SERVICE". However, because of the diversity of real queries, existing approaches require either manual data preprocessing or a specification of the target query domains, which hinders their applicability. We propose a probabilistic topic model that assumes that each term (e.g., "NY") has a topic (LOCATION). The key idea is that we consider topic co-occurrence in a query rather than a topic sequence, which significantly reduces computational cost yet enables us to acquire coherent topics without such preprocessing. Using two real query datasets, we demonstrate that the obtained topics are intelligible to humans and are highly accurate in keyword prediction and query generation tasks.
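To make the distinction between topic co-occurrence and topic sequence concrete, here is a minimal Python sketch. It is not the paper's model: it performs no probabilistic inference, and the `TERM_TOPIC` table, `query_pattern`, and `topic_pair_counts` are hypothetical names with hand-assigned topics chosen only to mirror the example in the abstract. It shows only that treating a query as an unordered set of topic pairs makes "boston hotel" and "hotel boston" yield the same pattern.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, hand-assigned term-to-topic table (illustrative only).
# In the paper, topics are latent and learned; here they are fixed by hand.
TERM_TOPIC = {
    "ny": "LOCATION",
    "boston": "LOCATION",
    "restaurant": "SERVICE",
    "hotel": "SERVICE",
}


def query_pattern(query):
    """Return the order-independent topic pattern of a query as a sorted tuple."""
    topics = [TERM_TOPIC.get(term, "UNKNOWN") for term in query.lower().split()]
    return tuple(sorted(topics))


def topic_pair_counts(queries):
    """Count unordered topic pairs that co-occur within each query."""
    counts = Counter()
    for query in queries:
        # Sorting inside query_pattern means word order does not matter.
        for pair in combinations(query_pattern(query), 2):
            counts[pair] += 1
    return counts


queries = ["NY restaurant", "boston hotel", "hotel boston"]
counts = topic_pair_counts(queries)
print(counts)
```

Because each pair is stored in sorted order, all three queries contribute to the same (LOCATION, SERVICE) count, whereas a sequence-based view would treat "hotel boston" as a different pattern.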